Website conversion to text

Why are websites converted to text documents?


Posted on Apr 6, 2025 4:52 PM


Apr 7, 2025 8:21 AM in response to LED84

Simply put, every web page is a combination of HTML markup and Cascading Style Sheets (CSS) syntax, stored as text, that the web browser renders in a visual, human-readable format.


Some tools can strip away that markup, leaving just plain text, but the result may be disjointed and require further processing before it resembles ordinary text suitable for other documents.


Apple's Safari browser can save a web page as source markup or as a binary webarchive. Among the tools in macOS that are accessible from the Terminal is the textutil(1) utility. When I save this post in Safari 18.4 as a webarchive to the Desktop, I can extract just the text from it using the following syntax in the Terminal:

textutil -format webarchive -convert txt -inputencoding NSUTF8StringEncoding -output foo.txt foo.webarchive


The result is not satisfying, because all the text you see in this thread is written to foo.txt as fragments separated by white space.
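
If that scattered output needs tidying, here is a minimal sketch of one way to post-process it. The function name tidy_text and the output name foo-clean.txt are my own inventions, not part of textutil, and it assumes that squeezing runs of spaces and blank lines is all the cleanup you want:

```shell
#!/bin/sh
# Hypothetical helper (not part of textutil): tidy white-space separated text.
# Reads text on standard input and writes the cleaned text to standard output.
tidy_text() {
    awk '
        {
            gsub(/[ \t]+/, " ")          # squeeze runs of spaces and tabs to one space
            sub(/^ /, "")                # drop a leading space
            sub(/ $/, "")                # drop a trailing space
            if ($0 == "") { blank++; next }   # remember empty lines, print nothing yet
            if (blank && seen) print ""       # one empty line between blocks of text
            blank = 0; seen = 1
            print
        }
    '
}
```

It would be used as: tidy_text < foo.txt > foo-clean.txt. This only cleans up spacing; it does not restore the reading order or the structure of the thread.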


@Charles Palenz


Safari does have a Show Page Source item on its optional Develop menu. That menu is enabled in Safari Settings > Advanced tab, at the bottom of that panel: Show features for web developers.


Apr 7, 2025 8:00 AM in response to LED84

When you are on a web page, save that page, and later want to view it, you must be double-clicking the saved file and getting the source code, which looks like text. All that text, the indents, left and right arrows (angle brackets), slashes, and so on, is code that the browser translates into the images and text you see in the browser window. There is no feature in Safari to view a page in source code, but some text programs (TextMate) will open an HTML page as source code. Apple's text editor opens it as RTF on my computer, and I haven't figured out how to stop it from doing that. If you have an HTML document on your desktop or in a folder, drag it to the Safari app and it will open as a normal web page. If you are double-clicking the HTML document, your text program must be opening it, and you see the source code, aka text.


Apr 7, 2025 11:17 AM in response to VikingOSX

VikingOSX and Tom Gewecke:

I appreciate your remarks, but they have nothing to do with LED84's query; they are distractions and not properly part of this thread. The purpose of this thread is to understand what LED84 experienced and explain what happened. BBEdit is a program I use, but it is unlikely to be what LED84 has. Text editors do show the source code, which is almost certainly what our friend is seeing and mistaking for a file conversion. Hopefully my explanation will help; if not, perhaps it will bring a response that lets us bring clarity to LED84.

