Post History
TASK I am coding up ebooks to a specific standard, and have a script that converts a string into the correct titlecase for this publisher. When working with some public domain source files, one of...
#5: Post edited
- **TASK**
- I am coding up ebooks to a specific standard, and have a script that converts a string into the correct titlecase for this publisher. When working with some public domain source files, one often gets this for a chapter title string:
-    `<p>HERE IS MY TITLE</p>`
- Using VSCodium (FOSS VS Code alternative), I can open each file, select the string between the `p` tags, then run the `titlecase` script with a hotkey that I've assigned it to. I end up with
-    `<p>Here Is My Title</p>`
- (VSCodium's native titlecase filter isn't up to this job.) I save the file, and go on to the next one.
- If you only have a few of these to do, that's fine. But sometimes there can be dozens, and it gets very tedious.
- **QUESTION**
- Is there a way that I can script this? I have scratched my head over both `awk` and `sed`, thinking that these are my prime options. But (as a rank amateur) I cannot work out how to:
- 1. iterate through all `chapter-*.xhtml` files in a directory,
- 1. extract my string (ALWAYS line 12 in the file, only string on line, between `<p>...</p>` tags),
- 1. **run my "external" `titlecase` filter on that string**,
- 1. replace the new string for the original one in the source file,
- 1. for all those files. :)
- (The step in bold is the one that is my biggest stumbling block.)
- **UPDATE**: Note that for my titlecase filter, ONLY the string between the tags) can be used, so that step #2 (extracting the string) is **mandatory**. Both the answers so far look very promising, but is it possible to do something like e.g. a regex on `sed -n '12p'` in one answer?
- The other answer suggests using [`pup`](https://github.com/ericchiang/pup/blob/master/README.md) although it would be helpful not to need extra packages if a simple regex would do.
- **UPDATE 2**: for "real" data, one could download the ZIP of [this commit in a Github repo](https://github.com/standardebooks/r-d-blackmore_lorna-doone/tree/cde77dba9b8e85536fd262c3e44ecee82b6c3ded) - the files in question are found at: `/src/epub/text/chapter-*.xhtml` = the 12th line of every "chapter-nn.xhtml" file.
- **TASK**
- I am coding up ebooks to a specific standard, and have a script that converts a string into the correct titlecase for this publisher. When working with some public domain source files, one often gets this for a chapter title string:
-    `<p>HERE IS MY TITLE</p>`
- Using VSCodium (FOSS VS Code alternative), I can open each file, select the string between the `p` tags, then run the `titlecase` script with a hotkey that I've assigned it to. I end up with
-    `<p>Here Is My Title</p>`
- (VSCodium's native titlecase filter isn't up to this job.) I save the file, and go on to the next one.
- If you only have a few of these to do, that's fine. But sometimes there can be dozens, and it gets very tedious.
- **QUESTION**
- Is there a way that I can script this? I have scratched my head over both `awk` and `sed`, thinking that these are my prime options. But (as a rank amateur) I cannot work out how to:
- 1. iterate through all `chapter-*.xhtml` files in a directory,
- 1. extract my string (ALWAYS line 12 in the file, only string on line, between `<p>...</p>` tags),
- 1. **run my "external" `titlecase` filter on that string**,
- 1. replace the new string for the original one in the source file,
- 1. for all those files. :)
- (The step in bold is the one that is my biggest stumbling block.)
- **UPDATE**: Note that for my titlecase filter, ONLY the string between the tags can be used, so that step #2 (*extracting* the string) is **mandatory**. Both the answers so far look very promising, but is it possible to do something like e.g. a regex on `sed -n '12p'` in one answer?
- The other answer suggests using [`pup`](https://github.com/ericchiang/pup/blob/master/README.md) although it would be helpful not to need extra packages if a simple regex would do.
- **UPDATE 2**: for "real" data, one could download the ZIP of [this commit in a Github repo](https://github.com/standardebooks/r-d-blackmore_lorna-doone/tree/cde77dba9b8e85536fd262c3e44ecee82b6c3ded) - the files in question are found at: `/src/epub/text/chapter-*.xhtml` = the 12th line of every "chapter-nn.xhtml" file.
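The five steps listed in the question could be sketched in plain POSIX shell roughly as follows. This is only an illustration, not one of the posted answers. It assumes that the `titlecase` script reads the bare title on stdin and writes the converted title to stdout (the question does not say how it takes its input), and that line 12 holds exactly one plain `<p>...</p>` pair. The function name `retitle` is made up for this example.

```shell
#!/bin/sh
# Sketch only. Assumes (not confirmed by the question) that the filter named
# in $1 reads the title on stdin and writes the converted title to stdout.

# retitle FILTER FILE...   ("retitle" is a hypothetical helper name)
retitle() {
    filter=$1
    shift
    for f in "$@"; do
        [ -f "$f" ] || continue

        # Step 2: print only the text between <p> and </p> on line 12.
        old=$(sed -n '12s/.*<p>\(.*\)<\/p>.*/\1/p' "$f")
        [ -n "$old" ] || { echo "skipped (no <p> title on line 12): $f" >&2; continue; }

        # Step 3: pass the bare string through the external filter.
        new=$(printf '%s\n' "$old" | "$filter")
        [ -n "$new" ] || { echo "skipped (filter returned nothing): $f" >&2; continue; }

        # Step 4: rebuild line 12 around the new title. Passing the title via
        # ENVIRON avoids escaping problems with characters such as & or /.
        tmp=$(mktemp) || return 1
        NEW=$new awk 'NR == 12 {
            start = index($0, "<p>") + 3
            end = index($0, "</p>")
            $0 = substr($0, 1, start - 1) ENVIRON["NEW"] substr($0, end)
        } { print }' "$f" > "$tmp" && cat "$tmp" > "$f"
        rm -f "$tmp"
    done
}

# Steps 1 and 5: run over every chapter file in the current directory, e.g.
#   retitle titlecase chapter-*.xhtml
```

The replacement is done with `awk` rather than `sed 's/old/new/'` so that the new title is never interpreted as a regex or replacement pattern. Writing back with `cat "$tmp" > "$f"` keeps the original file's permissions. It would be prudent to try this on a copy of the files first.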
#4: Post edited
- **TASK**
- I am coding up ebooks to a specific standard, and have a script that converts a string into the correct titlecase for this publisher. When working with some public domain source files, one often gets this for a chapter title string:
-    `<p>HERE IS MY TITLE</p>`
- Using VSCodium (FOSS VS Code alternative), I can open each file, select the string between the `p` tags, then run the `titlecase` script with a hotkey that I've assigned it to. I end up with
-    `<p>Here Is My Title</p>`
- (VSCodium's native titlecase filter isn't up to this job.) I save the file, and go on to the next one.
- If you only have a few of these to do, that's fine. But sometimes there can be dozens, and it gets very tedious.
- **QUESTION**
- Is there a way that I can script this? I have scratched my head over both `awk` and `sed`, thinking that these are my prime options. But (as a rank amateur) I cannot work out how to:
- 1. iterate through all `chapter-*.xhtml` files in a directory,
- 1. extract my string (ALWAYS line 12 in the file, only string on line, between `<p>...</p>` tags),
- 1. **run my "external" `titlecase` filter on that string**,
- 1. replace the new string for the original one in the source file,
- 1. for all those files. :)
- (The step in bold is the one that is my biggest stumbling block.)
- **UPDATE**: Note that for my titlecase filter, ONLY the string between the tags) can be used, so that step #2 (extracting the string) is **mandatory**. Both the answers so far look very promising, but is it possible to do something like e.g. a regex on `sed -n '12p'` in one answer?
- The other answer suggests using [`pup`](https://github.com/ericchiang/pup/blob/master/README.md) although it would be helpful not to need extra packages if a simple regex would do.
- **TASK**
- I am coding up ebooks to a specific standard, and have a script that converts a string into the correct titlecase for this publisher. When working with some public domain source files, one often gets this for a chapter title string:
-    `<p>HERE IS MY TITLE</p>`
- Using VSCodium (FOSS VS Code alternative), I can open each file, select the string between the `p` tags, then run the `titlecase` script with a hotkey that I've assigned it to. I end up with
-    `<p>Here Is My Title</p>`
- (VSCodium's native titlecase filter isn't up to this job.) I save the file, and go on to the next one.
- If you only have a few of these to do, that's fine. But sometimes there can be dozens, and it gets very tedious.
- **QUESTION**
- Is there a way that I can script this? I have scratched my head over both `awk` and `sed`, thinking that these are my prime options. But (as a rank amateur) I cannot work out how to:
- 1. iterate through all `chapter-*.xhtml` files in a directory,
- 1. extract my string (ALWAYS line 12 in the file, only string on line, between `<p>...</p>` tags),
- 1. **run my "external" `titlecase` filter on that string**,
- 1. replace the new string for the original one in the source file,
- 1. for all those files. :)
- (The step in bold is the one that is my biggest stumbling block.)
- **UPDATE**: Note that for my titlecase filter, ONLY the string between the tags) can be used, so that step #2 (extracting the string) is **mandatory**. Both the answers so far look very promising, but is it possible to do something like e.g. a regex on `sed -n '12p'` in one answer?
- The other answer suggests using [`pup`](https://github.com/ericchiang/pup/blob/master/README.md) although it would be helpful not to need extra packages if a simple regex would do.
- **UPDATE 2**: for "real" data, one could download the ZIP of [this commit in a Github repo](https://github.com/standardebooks/r-d-blackmore_lorna-doone/tree/cde77dba9b8e85536fd262c3e44ecee82b6c3ded) - the files in question are found at: `/src/epub/text/chapter-*.xhtml` = the 12th line of every "chapter-nn.xhtml" file.
#3: Post edited
- **TASK**
- I am coding up ebooks to a specific standard, and have a script that converts a string into the correct titlecase for this publisher. When working with some public domain source files, one often gets this for a chapter title string:
-    `<p>HERE IS MY TITLE</p>`
- Using VSCodium (FOSS VS Code alternative), I can open each file, select the string between the `p` tags, then run the `titlecase` script with a hotkey that I've assigned it to. I end up with
-    `<p>Here Is My Title</p>`
- (VSCodium's native titlecase filter isn't up to this job.) I save the file, and go on to the next one.
- If you only have a few of these to do, that's fine. But sometimes there can be dozens, and it gets very tedious.
- **QUESTION**
- Is there a way that I can script this? I have scratched my head over both `awk` and `sed`, thinking that these are my prime options. But (as a rank amateur) I cannot work out how to:
- - iterate through all `chapter-*.xhtml` files in a directory,
- - extract my string (ALWAYS line 12 in the file, only string on line, between `<p>...</p>` tags),
- - **run my "external" `titlecase` filter on that string**,
- - replace the new string for the original one in the source file,
- - for all those files. :)
- (The step in bold is the one that is my biggest stumbling block.)
- **TASK**
- I am coding up ebooks to a specific standard, and have a script that converts a string into the correct titlecase for this publisher. When working with some public domain source files, one often gets this for a chapter title string:
-    `<p>HERE IS MY TITLE</p>`
- Using VSCodium (FOSS VS Code alternative), I can open each file, select the string between the `p` tags, then run the `titlecase` script with a hotkey that I've assigned it to. I end up with
-    `<p>Here Is My Title</p>`
- (VSCodium's native titlecase filter isn't up to this job.) I save the file, and go on to the next one.
- If you only have a few of these to do, that's fine. But sometimes there can be dozens, and it gets very tedious.
- **QUESTION**
- Is there a way that I can script this? I have scratched my head over both `awk` and `sed`, thinking that these are my prime options. But (as a rank amateur) I cannot work out how to:
- 1. iterate through all `chapter-*.xhtml` files in a directory,
- 1. extract my string (ALWAYS line 12 in the file, only string on line, between `<p>...</p>` tags),
- 1. **run my "external" `titlecase` filter on that string**,
- 1. replace the new string for the original one in the source file,
- 1. for all those files. :)
- (The step in bold is the one that is my biggest stumbling block.)
- **UPDATE**: Note that for my titlecase filter, ONLY the string between the tags) can be used, so that step #2 (extracting the string) is **mandatory**. Both the answers so far look very promising, but is it possible to do something like e.g. a regex on `sed -n '12p'` in one answer?
- The other answer suggests using [`pup`](https://github.com/ericchiang/pup/blob/master/README.md) although it would be helpful not to need extra packages if a simple regex would do.
#2: Post edited
How to extract string from file, run filter, and replace in file with new value?
- **TASK**
- I am coding up ebooks to a specific standard, and have a script that converts a string into the correct titlecase for this publisher. When working with some public domain source files, one often gets this for a chapter title string:
-    `<p>HERE IS MY TITLE</p>`
- Using VSCodium (FOSS VS Code alternative), I can open each file, select the string between the `p` tags, then run the `titlecase` script with a hotkey that I've assigned it to. I end up with
-    `<p>Here Is My Title</p>`
- (VSCodium's native titlecase filter isn't up to this job.) I save the file, and go on to the next one.
- If you only have a few of these to do, that's fine. But sometimes there can be dozens, and it gets very tedious.
- **QUESTION**
- Is there a way that I can script this? I have scratched my head over both `awk` and `sed`, thinking that these are my prime options. But (as a rank amateur) I cannot work out how to:
- - iterate through all `chapter-*.xhtml` files in a directory,
- - extract my string (ALWAYS line 12 in the file, only string on line, between `<p>...</p>` tags),
- - **run my "external" `titlecase` filter on that string**,
- - replace the new string for the original one in the source file,
- - for all those files. :)
- (The step in bold is the one that is my biggest stumbling block.)
- I would be grateful for any help with this! If I've omitted any relevant information, please say and I'll remedy that a.s.a.p.
- David / Fife, UK
- **TASK**
- I am coding up ebooks to a specific standard, and have a script that converts a string into the correct titlecase for this publisher. When working with some public domain source files, one often gets this for a chapter title string:
-    `<p>HERE IS MY TITLE</p>`
- Using VSCodium (FOSS VS Code alternative), I can open each file, select the string between the `p` tags, then run the `titlecase` script with a hotkey that I've assigned it to. I end up with
-    `<p>Here Is My Title</p>`
- (VSCodium's native titlecase filter isn't up to this job.) I save the file, and go on to the next one.
- If you only have a few of these to do, that's fine. But sometimes there can be dozens, and it gets very tedious.
- **QUESTION**
- Is there a way that I can script this? I have scratched my head over both `awk` and `sed`, thinking that these are my prime options. But (as a rank amateur) I cannot work out how to:
- - iterate through all `chapter-*.xhtml` files in a directory,
- - extract my string (ALWAYS line 12 in the file, only string on line, between `<p>...</p>` tags),
- - **run my "external" `titlecase` filter on that string**,
- - replace the new string for the original one in the source file,
- - for all those files. :)
- (The step in bold is the one that is my biggest stumbling block.)
#1: Initial revision
How to extract string from file, run filter, and replace in file with new value?
**TASK**
I am coding up ebooks to a specific standard, and have a script that converts a string into the correct titlecase for this publisher. When working with some public domain source files, one often gets this for a chapter title string:
   `<p>HERE IS MY TITLE</p>`
Using VSCodium (FOSS VS Code alternative), I can open each file, select the string between the `p` tags, then run the `titlecase` script with a hotkey that I've assigned it to. I end up with
   `<p>Here Is My Title</p>`
(VSCodium's native titlecase filter isn't up to this job.) I save the file, and go on to the next one.
If you only have a few of these to do, that's fine. But sometimes there can be dozens, and it gets very tedious.
**QUESTION**
Is there a way that I can script this? I have scratched my head over both `awk` and `sed`, thinking that these are my prime options. But (as a rank amateur) I cannot work out how to:
- iterate through all `chapter-*.xhtml` files in a directory,
- extract my string (ALWAYS line 12 in the file, only string on line, between `<p>...</p>` tags),
- **run my "external" `titlecase` filter on that string**,
- replace the new string for the original one in the source file,
- for all those files. :)
(The step in bold is the one that is my biggest stumbling block.)
I would be grateful for any help with this! If I've omitted any relevant information, please say and I'll remedy that a.s.a.p.
David / Fife, UK