Generate non-obfuscated binary content for PDF files

When I open a PDF file compiled from LaTeX in a text editor (e.g. Notepad++), the content is not human-readable and looks like the sample below, so it seems to me that potential crawlers cannot process the information.




n×å9.â^Ñäùàɨ•” HTÏ•ì#ò Ž–}q”mäŠ9ÒrbtRšá™g—û}Açú¦nƒÖ…‡­”jKœˆ FàƵÀmþåá•N:º‚~éWF¶DX‹m#‚D˜Àm;Ñum?OŠÀÊ¢ßÎ[ÈuóõÄ÷;Ý6"-@pñäÙ(ÖXÜÕËaœyýûdRìørêÑbβ(n^Øþ2Öƒ;¬÷ª»¦Òv0þ®±úßY'°³½‹%…ߥºíúŸKåÒ춶êæñÕ_–áúª –ò1üj9¶,Ö×VæY¼wæ¬Döð}]




Is it possible to generate the PDF file so that, when the document displays a piece of information such as "Specific detail 1", I can also find the string "Specific detail 1" in the binary content of the file when I open it in a text editor?

This is useful, for example, when a PDF resume created in LaTeX must be parsed automatically by various text analyzers.










  • Welcome to TeX.SX! What you see is compression; the stream still contains the textual output, and tools such as pdftotext can process it. Which crawler are you referring to that does not support compression?
    – TeXnician
    Aug 29 at 9:22

  • I didn't know that crawlers can read compressed PDFs. I assumed that if the content is not human-readable, then it is not crawler-readable either.
    – Alexandru Irimiea
    Aug 29 at 22:12














asked Aug 29 at 9:20
Alexandru Irimiea
1 Answer
For pdfTeX

\pdfcompresslevel = 0 %
\pdfobjcompresslevel = 0 %


For LuaTeX

\pdfvariable compresslevel = 0 %
\pdfvariable objcompresslevel = 0 %


For use with (x)dvipdfmx (XeTeX, upTeX, etc.)

\special{dvipdfmx:config z 0}
\special{dvipdfmx:config C 0x40}
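
As a minimal sketch of where these settings go, assuming the document is compiled with pdflatex (the pdfTeX case above), the two assignments can sit in the preamble. The document class and the body text are placeholders chosen for illustration and are not part of the original answer.

```latex
% Minimal sketch, assuming compilation with pdflatex (pdfTeX engine).
\documentclass{article}

% Write page content streams without compression.
\pdfcompresslevel=0
% Do not pack PDF objects into compressed object streams.
\pdfobjcompresslevel=0

\begin{document}
Specific detail 1
\end{document}
```

With these settings the page content is stored as plain PDF operators rather than as a compressed stream. The literal string is still not guaranteed to appear verbatim in the file: inter-word spaces and kerning are written as positioning offsets between text fragments, and ligatures such as "fi" are single glyphs.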





  • Probably in the near future I'll add an interface for this to expl3.
    – Joseph Wright♦
    Aug 29 at 9:25

  • +1, but I wonder whether a crawler that doesn't support PDF compression likes BT /F8 9.9626 Tf 148.712 707.125 Td [(My)-333(sup)-28(er)-333(imp)-28(ortan)28(t)-334(text.)]TJ 154.421 -567.87 Td [(1)]TJ ET…
    – TeXnician
    Aug 29 at 9:27

  • @TeXnician Sure, we can't do that much about that!
    – Joseph Wright♦
    Aug 29 at 9:28

  • @AlexG Sure, but the point is that if a crawler can't understand PDF compression, only 'text' in a PDF, it probably can't follow the kerning and whatever either.
    – Joseph Wright♦
    Aug 29 at 10:14










answered Aug 29 at 9:25
Joseph Wright♦