Website conversion to text
Why are websites converted to text documents
Simply put, every web page is a combination of HTML markup and Cascading Style Sheets (CSS) syntax, all in text form, that the web browser renders in a visual, human-readable format.
Some tools can strip away that markup, leaving just plain text, but the result may be disjointed and require further processing before it resembles ordinary text that you can place in other documents.
Apple's Safari browser can save a web page as source markup, or as a binary Webarchive. Among the tools in macOS and accessible from the Terminal is the textutil(1) utility. When I save this post in Safari 18.4 as a webarchive to the Desktop, I can extract just the text from it using the following syntax in the Terminal:
textutil -format webarchive -convert txt -inputencoding NSUTF8StringEncoding -output foo.txt foo.webarchive
The result is unsatisfying: the text that you see in the threads of this post is written to foo.txt as whitespace-separated fragments.
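One way to make that output a little more usable is to post-process it in the shell. The following is only a minimal sketch: tidy_text is a hypothetical helper name, and it assumes the extracted file is the foo.txt produced by the command above. It trims leading and trailing whitespace on each line and collapses runs of blank lines into a single blank line. It does not rebuild paragraphs or the thread structure.

```shell
# Hypothetical helper: tidy plain text extracted by textutil.
# Trims leading/trailing whitespace on every line, then squeezes
# consecutive blank lines into one (cat -s).
tidy_text() {
  sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' "$1" | cat -s
}

# Usage (foo.txt is the file written by textutil above;
# foo_clean.txt is an assumed output name):
#   tidy_text foo.txt > foo_clean.txt
```

The helper only reads its input, so foo.txt is left untouched and you can compare the two files afterwards.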
@Charles Palenz
Safari does have a Show Page Source command on its optional Develop menu. That menu is enabled in Safari Settings > Advanced, at the bottom of that panel: Show features for web developers.
My rule of thumb is never to open structured code in TextEdit, regardless of its Plain text option. That is why there are the free BBEdit and more advanced programmer's editors with syntax awareness for this purpose.
LED84 wrote:
Why are websites converted to text documents
Please provide some details and context. Websites are text documents.
When you are on a web page, save that page, and later want to view it, you are probably double-clicking the saved file and getting the source code, which looks like text. All that text, with its indents, angle brackets, slashes, etc., represents code that the browser translates into the images and text you see in the browser window. There is no feature in Safari to view a page as source code, but some text programs (TextMate) will open an HTML page as source code. Apple's text editor opens it as RTF on my computer, and I haven't figured out how to stop that. If you have an HTML document on your desktop or in a folder, drag it to the Safari app and it will open as a normal web page. If you are double-clicking the HTML document, your text program must be opening it, and you are seeing the source code, aka text.
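The same two views can also be reached from the Terminal. This is just a sketch under a few assumptions: view_rendered and view_source are hypothetical helper names, page.html is a placeholder for whatever file was saved, and the macOS open command is used to hand the file to Safari. Printing the file with cat shows that the saved page is already plain text markup, with no conversion involved.

```shell
# Hypothetical helpers; page.html in the usage lines is a placeholder name.

# Open a saved HTML file in Safari, rendered as a normal web page (macOS).
view_rendered() {
  open -a Safari "$1"
}

# Print the same file in the Terminal to see its source markup as text.
view_source() {
  cat "$1"
}

# Usage:
#   view_rendered page.html
#   view_source page.html
```

Both helpers read the same unchanged file; only the program displaying it differs.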
VikingOSX and Tom Gewecke:
I can appreciate your remarks, but they have nothing to do with LED84's query; they are distractions and not properly part of this thread. The purpose of this thread is to understand what LED84 experienced and to explain what happened. BBEdit is a program I use, but it is unlikely to be what LED84 has. Text editors do show the source code, which is almost certainly what our friend is seeing and mistaking for a file conversion. Hopefully my explanation will help; if not, I hope it brings a response that lets us bring clarity to LED84.
LED84 wrote:
Why are websites converted to text documents
Provide a URL and a screenshot. No idea what you are referring to.
Charles Palenz wrote:
Apple's text editor opens it as RTF on my computer, and I haven't figured out how to stop that.
TextEdit > Settings > New Document > Format