PDF files are difficult to edit. Compared with the DOCX format, what is the point of PDF existing?

2022-08-04 11:28

First, Word is not a format; .docx is the format. And PDF was never meant to be something you casually edit.



Think about it: the same .docx file can look different when opened in different word processors (such as MS Word, WPS, or LibreOffice), and even in different versions of the same one (such as Word 2003 and Word 2019). This means .docx was never designed to produce stable output. Its main use case is quickly turning out short documents that look reasonably good. PDF, by contrast, was created in pursuit of stability. In other words, one of the purposes of PDF is that you can open the same file on different hardware and software and see exactly the same result.


Think about it: what does PDF stand for? Portable Document Format. That makes the point clear enough.

Now for the difference between the two formats. A .docx file is essentially a ZIP archive of XML files, while PDF is its own format. Each paragraph in a .docx can be thought of as one long line, which a greedy line-breaking method splits into several lines when the paragraph is laid out. PDF is different: a line is a line. If you copy a passage from a PDF straight into Word, you will find a paragraph mark at the end of every line. In fact, PDF has no concept of a paragraph at all. All it records is the position of each line and the relative spacing between the words within it. This is one of the reasons PDFs are difficult to edit.
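The contrast can be shown with a minimal sketch. It assumes widths are measured in characters rather than real glyph widths, and `greedy_wrap`, the sample text, and the coordinates are all made up for illustration; this is not how any particular word processor or PDF writer is actually implemented.

```python
def greedy_wrap(text: str, width: int) -> list[str]:
    """Break one logical line into lines of at most `width` characters.

    Greedy: keep adding words to the current line until the next word
    would not fit, then start a new line. A single word longer than
    `width` is left on a line of its own.
    """
    lines: list[str] = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= width or not current:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


# .docx-like model: the paragraph is stored once and re-wrapped on demand,
# so a different page width simply produces different line breaks.
paragraph = "In a .docx file a paragraph is stored as one logical run of text"
for width in (20, 40):
    print(width, greedy_wrap(paragraph, width))

# PDF-like model: the line breaks are frozen at creation time. Each line is
# stored with a position (hypothetical x, y values here), and nothing marks
# which lines once belonged to the same paragraph.
pdf_like_lines = [
    (72, 700 - 14 * i, line)
    for i, line in enumerate(greedy_wrap(paragraph, 20))
]
for x, y, line in pdf_like_lines:
    print(x, y, repr(line))
```

In the first model, changing the width is cheap because the paragraph still exists as a unit. In the second, an editor would have to guess which positioned lines belong together before it could reflow anything.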

Fonts are another big problem. Fonts in a .docx file are usually not embedded; the processor looks them up on the system instead. In a PDF, the vast majority of fonts are embedded in the file in binary form. Authors who don't embed fonts in their PDFs can basically be considered careless. The whole point of PDF is stability: every typographic detail is recorded reliably inside the file. So don't try to "edit" a PDF. When you add annotations to a PDF, you are actually producing a new PDF rather than directly modifying the original.
