Find all the line numbers with binary (non-text) characters in a big log file
I have a big log file that contains non-text characters. When I searched it with grep, I got this result:
Binary file (standard input) matches
I can use grep -a to skip the lines with non-text characters.
Now, how can I find all the lines that contain non-text characters?
grep
edited Feb 25 at 9:21 by Archemar
asked Feb 25 at 9:15 by Ken Tsang
1 Answer
What GNU grep considers non-text varies with the version and the locale.
As a first approximation, you can try:
grep -anPe '^((?!.*$)|.*\0)' < file.log
That looks for lines that contain a NUL character, that is a 0 byte (likely the cause of that Binary file message if your log file was truncated while still open for writing by some process without O_APPEND), or non-characters, that is byte sequences that are not valid in the locale's charset (possible if you're in a locale with a multibyte charset like UTF-8 and some lines were output in another charset).
That assumes your GNU grep was built with PCRE support (for -P).
You may want to pipe that output to something like sed -n l, hexdump -C or od -vtc -tx1 (and maybe omit the -n option to grep) to try and identify those byte sequences that cause the binary message.
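As a rough illustration of what such a dump makes visible, the sketch below runs a made-up line containing a NUL and a stray Latin-1 byte through od -vtc -tx1. The sample data and the dump_bytes name are assumptions for the example only.

```shell
# Hypothetical helper: show each byte of stdin as a character and as hex.
dump_bytes() {
  od -vtc -tx1
}

# Made-up sample: "a", a NUL byte, "b", the byte 0xE9 (e-acute in Latin-1), newline.
printf 'a\000b\351\n' | dump_bytes
```

In the character row the NUL appears as \0 and the byte that is not valid on its own in UTF-8 appears as an octal code, with the matching hex values on the row below.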
Note that grep -a does not skip those lines; it just tells GNU grep not to treat files it considers binary specially. Lines with those 0 bytes or non-characters will still be reported if they match the pattern.
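A small check of that behaviour, using a throwaway file with made-up contents (the file and the pattern are assumptions for the example):

```shell
demo=$(mktemp)
# Two lines; the second one contains a NUL byte.
printf 'error: disk full\nerror: bad\000byte\n' > "$demo"

# With -a the file is searched as text, so the line holding the NUL byte
# is still counted as a match for "bad" rather than being skipped.
with_a=$(grep -a -c 'bad' "$demo")
echo "lines matching 'bad' with -a: $with_a"

rm -f "$demo"
```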
On Linux at least, and on most native filesystems, you can tell whether a file is sparse, that is, whether it has unallocated parts (holes) that would read as zero bytes, with:
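perl -le '
  # whence 4 is SEEK_HOLE and 3 is SEEK_DATA on Linux
  seek STDIN, 0, 4 or die "SEEK_HOLE failed: $!"; $hole = tell STDIN;
  seek STDIN, $hole, 3 and $data = tell STDIN;
  seek STDIN, 0, 2; $end = tell STDIN;
  # a file without holes reports its only "hole" at the end of the file
  if ($hole != $end) {
    print "sparse: first hole at byte offset $hole",
      defined $data ? ", data resumes at $data" : "";
  }' < file.log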
Holes would be created whenever the gap would otherwise include at least one full filesystem block (typically 4KiB). There would probably be more NUL bytes on either side of those holes.
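A cruder heuristic, swapped in here as an alternative to the seek-based check, is to compare the space a file occupies on disk with its apparent size. The sketch assumes GNU stat (for -c and the %s, %b and %B format sequences), and looks_sparse is a hypothetical helper name. Filesystem compression can also make a file occupy less than its apparent size, so treat a positive result as a hint only.

```shell
# Hypothetical helper: succeed if the file occupies fewer bytes on disk than
# its apparent size, which usually means it has holes.
# Assumes GNU stat: %s = size in bytes, %b = allocated blocks, %B = bytes per block.
looks_sparse() {
  info=$(stat -c '%s %b %B' -- "$1") || return 2
  set -- $info
  [ $(( $2 * $3 )) -lt "$1" ]
}
```

For example: looks_sparse file.log && echo "file.log probably has holes".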
edited Feb 25 at 10:44
answered Feb 25 at 9:39 by Stéphane Chazelas