Uniquifying PDF for traceability
I am planning to send out a confidential document to a number of people. Each of them will have to sign a non-disclosure agreement. Still, the document may find its way onto the internet. If I happen to come across it, I want to be able to identify who made it available so I can contact that person.
How would I embed some sort of invisible signature that I can verify on a published version? I don't want the user to have to jump through the hoops of entering a password. I also don't want to embed a clearly identifiable name. I'd just create a spreadsheet with names and the associated embedded signatures for later identification purposes.
I understand nothing's foolproof and someone could just copy the contents, rewrite them or whatever. I just want to raise the hurdle by embedding some sort of "silent" code, in the hope that the infringing party is careless enough to make the document available as is.
I would prefer to use some simple LaTeX code, accessing something like pdfinfo, rather than using a third-party tool.
Any suggestions?
Tags: pdftex, watermark
edited Aug 24 at 7:44 by Martin Schröder
asked Aug 23 at 19:29 by Hansel
Wouldn't a watermark be more efficient: It clarifies that the document is not for the eyes of the public, you can make it unique and it stays even when printing (as an example)?
– TeXnician, Aug 23 at 19:53
Somewhat related: tex.stackexchange.com/a/440271/134574
– Phelype Oleinik, Aug 23 at 19:54
Make a few minor spelling mistakes, grammar mistakes or font changes. This will foil photographing or scanning. The first two will foil simple cutting and pasting.
– Theodore Norvell, Aug 24 at 0:23
2 Answers
You do not need any LaTeX code: every PDF file is supposed to have a unique ID. To see it, just open the PDF file in a text editor; at the end of the file you will find a line like this (with a different ID, of course):
/ID [<B825FFAF5C24E0EBBF2E5D369546DC86> <B825FFAF5C24E0EBBF2E5D369546DC86>]
If you recompile your file, the ID changes.
You only care about the first ID; the second one is for tracking changes to the file. Here the second is equal to the first, so we have the original file. A program which changes the file is supposed to change the second ID and keep the first one intact.
Of course, this might be too obvious. It is easy for the receiver of the document to change these IDs.
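To check a copy found online against your list, you do not have to open it in an editor each time. A minimal Python sketch, assuming the /ID entry is written as two hex strings in plain text (as in the line above) and that the file is not encrypted; the function names and the ledger dictionary are made up for illustration:

```python
import re

# Matches a trailer entry such as: /ID [<HEX> <HEX>]
ID_PATTERN = re.compile(
    rb"/ID\s*\[\s*<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>\s*\]"
)


def extract_pdf_ids(data: bytes):
    """Return (first_id, second_id) from the last /ID entry, or None if absent."""
    matches = ID_PATTERN.findall(data)
    if not matches:
        return None
    # An incrementally updated file can contain several trailers;
    # the last one in the file is the current one.
    first, second = matches[-1]
    return first.decode("ascii").upper(), second.decode("ascii").upper()


def identify_recipient(data: bytes, ledger: dict):
    """Look up the first ID in a ledger mapping ID -> recipient name (hypothetical)."""
    ids = extract_pdf_ids(data)
    if ids is None:
        return None
    return ledger.get(ids[0])
```

Read the suspicious file with open(path, "rb").read() and pass the bytes in. If the IDs were edited or regenerated by another tool, the lookup simply returns None.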
You can also just write down the CreationDate (you can find it, for example, with pdfinfo). If you create a separate file for each person, the dates will probably differ by a second or more.
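If pdfinfo is not at hand, the CreationDate can also be read directly from the file. A rough sketch, assuming the Info dictionary is stored uncompressed and the date uses the full D:YYYYMMDDHHmmSS form with an optional time zone; creation_date is a hypothetical helper, not part of any tool:

```python
import re
from datetime import datetime, timedelta, timezone

# Matches e.g. /CreationDate (D:20180823195100+02'00')
CREATION_DATE = re.compile(
    rb"/CreationDate\s*\(D:(\d{14})([+\-Z])?(?:(\d{2})'(\d{2})'?)?\)"
)


def creation_date(data: bytes):
    """Return the first CreationDate found as a datetime, or None if absent."""
    match = CREATION_DATE.search(data)
    if match is None:
        return None
    stamp, sign, hours, minutes = match.groups()
    moment = datetime.strptime(stamp.decode("ascii"), "%Y%m%d%H%M%S")
    if sign in (b"+", b"-") and hours is not None:
        offset = timedelta(hours=int(hours), minutes=int(minutes))
        if sign == b"-":
            offset = -offset
        return moment.replace(tzinfo=timezone(offset))
    if sign == b"Z":
        return moment.replace(tzinfo=timezone.utc)
    # No time zone given: return a naive datetime.
    return moment
```

Comparing the result to the second against your notes is enough to tell apart copies that were compiled one after the other.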
answered Aug 23 at 19:51 by Marcel Krüger (accepted answer)
+1, and then the person prints the document and uploads a scanned version ;)
– TeXnician, Aug 23 at 19:54
@TeXnician Nobody would upload a scanned version anymore, most people would photograph the document with their phones
– Marcel Krüger, Aug 23 at 20:00
Ooh, that's sweet and so simple. That's all I need. Very easy to use. Thank you very much!
– Hansel, Aug 23 at 20:05
Nobody in their right mind publishes a PDF unchanged on the internet after signing an NDA. So the real question is how to trace the leak when the recipient of your PDF thinks twice before publishing.
The PDF ID doesn't help, because it changes completely if the PDF file is sent through a PDF printer such as pdf24.
You have to produce an individual PDF for each recipient, one that differs on every page. The easiest way to do that is to change the font or the font size, which will change a lot of line breaks on each page. The best way, IMO, is to have individual texts.
Clever recipients might just copy the text out of your PDF. You could produce a sandwich PDF and put a typo in the invisible background text that differs only a little from the typo in the visible text.
Besides that, you can produce very individual-looking PDFs with the very nice LuaTeX package chickenize. The option randomcolor_grey might help you.
There is much more you can do: use eso-pic to put a dot matrix in the background, and so on. Countermeasure: take pictures of the screen with a mobile phone and run them through OCR.
However, can't you just change the critical part of the text a little bit for each individual PDF?
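If you go the route of individual texts, the bookkeeping can be sketched as follows. This is only an illustration, assuming you have prepared a few spots in the document (hypothetical labels such as typo_p3 or font_size) that each have a small number of interchangeable variants; every recipient gets a different combination, and a leaked copy is matched against whatever spots survived:

```python
from itertools import product
from math import prod


def assign_fingerprints(recipients, variation_points):
    """Give each recipient a distinct combination of variant indices.

    variation_points maps a label (hypothetical, chosen by you) to the
    number of interchangeable variants available at that spot.
    """
    labels = list(variation_points)
    if prod(variation_points[label] for label in labels) < len(recipients):
        raise ValueError("not enough variant combinations for all recipients")
    combos = product(*(range(variation_points[label]) for label in labels))
    # zip stops after the last recipient, so unused combinations are ignored.
    return {name: dict(zip(labels, combo)) for name, combo in zip(recipients, combos)}


def identify(ledger, observed):
    """Return the recipients whose fingerprint agrees with every observed spot."""
    return [
        name
        for name, marks in ledger.items()
        if all(marks.get(label) == value for label, value in observed.items())
    ]
```

Because identify accepts a partial observation, a copy that lost the font size through copy and paste can still be narrowed down by the typos alone. The chosen values still have to be passed to the LaTeX run by some means; that part is not shown here.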
answered Aug 24 at 12:24 by Keks Dose