awk paragraph does not work
























I have downloaded the KingBase Lite 2018 Update 3 file from here. I now want to extract data from a single event such as the "FIDE Candidates 2018": I want to get all the paragraphs containing this text and the paragraph below it, so I have the whole pgn for each game.



To first just get the paragraph that contains the text, I followed these recommendations.



However, when I try awk -v RS='' -v ORS='\n\n' '/FIDE Candidates 2018/' KingBaseLite2018-03.pgn, it just prints the whole file. When I search for a word that does not exist, it does not print anything. So I assume it does the search correctly, but it somehow does not properly cut at new lines. There might be something awkward about the new line characters in that file. When I try other suggestions from the above link like using perl, I get the same result.



What can I do to get the paragraph now? And how can I include one paragraph below as well?
























  • 2




    Might be due to your use of RS? Anyhow, you should post a sample input and desired output to help those who want to help you reproduce your issue and test the solutions.
    – simlev
    Jun 6 at 10:29






  • 3




    The problem seems to be that the file has Windows-style CRLF line-endings
    – steeldriver
    Jun 6 at 11:35






  • 2




    ... although it's not entirely equivalent, setting RS='\r\n\r\n' is probably sufficient in this case
    – steeldriver
    Jun 6 at 11:54















edited Jun 6 at 12:34
























asked Jun 6 at 10:07









maddingl



















1 Answer



accepted










I downloaded and unzipped the file, and its line endings are CRLF, so you need to account for that. You can either convert the file with a tool like fromdos or, if you don't want to modify the file, tell Perl to do the translation with its :crlf PerlIO layer, which is what I'm doing below with the PERLIO environment variable. (There are other ways to change the layers, but this one was easiest for a one-liner.)
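To illustrate why paragraph mode breaks on such a file, here is a minimal sketch using a made-up four-paragraph sample (the temporary file and its contents are assumptions for illustration, not the real KingBase data). With CRLF endings, the separator lines still contain a carriage return, so awk with RS='' never sees a truly empty line and treats the whole file as one record. Deleting the carriage returns first restores the paragraph boundaries.

```shell
# Hypothetical two-game sample with CRLF line endings (not the real KingBase data).
sample=$(mktemp)
printf '[Event "A"]\r\n\r\n1. e4 e5\r\n\r\n[Event "B"]\r\n\r\n1. d4 d5\r\n' > "$sample"

# With CRLF, each separator line holds a lone \r, so RS='' finds no empty line.
crlf_records=$(awk -v RS='' 'END { print NR }' "$sample")

# Deleting every \r makes the separator lines truly empty again.
lf_records=$(tr -d '\r' < "$sample" | awk -v RS='' 'END { print NR }')

echo "records with CRLF: $crlf_records, after deleting CR: $lf_records"
rm -f "$sample"
```

Running `file KingBaseLite2018-03.pgn` should give a similar hint, since it normally reports "with CRLF line terminators" for files like this.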



I'm using the flip-flop operator ... to extract only the paragraph that matches the regex plus the following one that matches /^1\./ (since all the paragraphs in the file start with either [ or 1.).

wget http://kingbase-chess.net/download/650 -O KingBaseLite2018-03.zip
unzip KingBaseLite2018-03.zip
PERLIO=:crlf perl -00ne 'print if /"FIDE Candidates 2018"/.../^1\./' KingBaseLite2018-03.pgn
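For anyone who would rather stay with awk, a rough counterpart might look like the sketch below. It is not part of the tested solution above. It relies on the same assumption, namely that every header paragraph is immediately followed by exactly one paragraph of moves. The sample data and the variable names (sample, games, want) are made up for illustration.

```shell
# Hypothetical three-game sample with CRLF line endings (not the real KingBase data).
sample=$(mktemp)
printf '[Event "FIDE Candidates 2018"]\r\n[White "X"]\r\n\r\n1. e4 e5 1-0\r\n\r\n' > "$sample"
printf '[Event "Other Open"]\r\n[White "Y"]\r\n\r\n1. d4 d5 0-1\r\n\r\n' >> "$sample"
printf '[Event "FIDE Candidates 2018"]\r\n[White "Z"]\r\n\r\n1. c4 c5 1/2-1/2\r\n' >> "$sample"

# Delete carriage returns, then read blank-line-separated paragraphs (RS='').
# A matching header paragraph is printed and sets a flag; the flag makes awk
# print the paragraph right after it (the moves) as well, then clears itself.
games=$(tr -d '\r' < "$sample" |
  awk -v RS='' -v ORS='\n\n' '
    /"FIDE Candidates 2018"/ { print; want = 1; next }
    want                     { print; want = 0 }
  ')

printf '%s\n' "$games"
rm -f "$sample"
```

Note that the output has LF line endings, because the carriage returns are deleted before awk sees the data.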
























  • 1




    This works perfectly! Thank you very much :)
    – maddingl
    Jun 6 at 12:33










edited Jun 6 at 11:45


























answered Jun 6 at 11:38









haukex
