How to extract text and tabular elements from many .doc files into CSV with any Unix tool?
This thread covers part (1) of the thread How to split Excel table into CSV files in .doc by Bold text?
I have 777 .doc files, each of which contains a big Excel table.
The following manual process works correctly and lets me work with the data, since spreadsheet data can be converted into CSV files and databases:
- select everything with CTRL+A
- copy-paste the content into any spreadsheet editor (I used WPS editor)
However, I want to automate this step, both to better evaluate the export from .doc file to data file and because there are too many files to process by hand.
I cannot study the contents of all those files in bulk, so I am thinking of a scripting approach that iterates through all the .doc files, but I am not sure whether any interface and/or scripting tool exists for such a task.
Source files: .doc files containing some text and Excel tables
Target file: an Excel file or anything similar, e.g. a WPS editor file, LibreOffice file, ...
- one Excel file can be sufficient for all .doc files, because each .doc file has a top line that can be used as a heading and separator when categorising the content later
OS: Linux Debian Stretch 9 and others
Data: example .odt file here
debian csv microsoft-word
@DopeGhoti Generally, I cannot do the iteration part: neither processing all files at once nor looping over them one by one.
– Léo Léopold Hertz 준영
Oct 27 '17 at 18:10
asked Oct 27 '17 at 17:40
Léo Léopold Hertz 준영
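One possible way to do the iteration part, sketched in Python under several assumptions: LibreOffice is installed and its `soffice` binary is on the PATH, its HTML export of a .doc file is written as `<name>.html` into the output directory, and the tables survive that export as ordinary `<table>` markup. The names `TableRows`, `html_to_rows`, `convert_all` and the `html_out` directory are made up for illustration. The sketch collects table cells only, so text outside tables is dropped, and it writes the source file name into the first CSV column as the separator between documents instead of relying on the top line of each file. Nested tables are not handled.

```python
import csv
import subprocess
from html.parser import HTMLParser
from pathlib import Path


class TableRows(HTMLParser):
    """Collect every table row (<tr>) as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_to_rows(html_text):
    """Return all table rows found in an HTML string."""
    parser = TableRows()
    parser.feed(html_text)
    parser.close()
    return parser.rows


def convert_all(src_dir, out_csv, work_dir="html_out"):
    """Convert each .doc in src_dir to HTML and append its table rows to one CSV."""
    Path(work_dir).mkdir(exist_ok=True)
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        for doc in sorted(Path(src_dir).glob("*.doc")):
            # Assumption: soffice is on PATH and writes <stem>.html into work_dir.
            subprocess.run(
                ["soffice", "--headless", "--convert-to", "html",
                 "--outdir", work_dir, str(doc)],
                check=True,
            )
            html_file = Path(work_dir) / (doc.stem + ".html")
            text = html_file.read_text(encoding="utf-8", errors="replace")
            for row in html_to_rows(text):
                # The first column names the source file, so the single CSV
                # can later be split or grouped per document.
                writer.writerow([doc.name] + row)
```

It would be called as `convert_all("docs", "all_tables.csv")`, where both paths are placeholders. Going through HTML is a design choice of this sketch, not a requirement: it keeps the cell boundaries that a plain-text export would lose, and the parsing step needs nothing beyond the standard library.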