Trying to use > as a segment delimiter in memoQ
Thread poster: ALAN LAMBSON
ALAN LAMBSON
United States
Local time: 07:36
Member (2021)
Spanish to English
Jun 23, 2023

I must re-translate a JSON version of a file that was previously translated in InDesign. The JSON content is imported straightforwardly, but now has many insertions with angle brackets such as and and other, more complex but always ending in >, that prevent many segments from getting a good match with what I have in the legacy TM. For instance, I sometimes have several sentences "glued together" with these sequences. Each individual sentence has a match in the TM, but the new segment with all of these concatenated no longer matches.

All I want to do is segment the JSON text on input so that it uses ">" as a segment-end delimiter. In theory, this should break out most or all of these sequences as separate segments, which can be simply copied from source and ignored. Exporting back to JSON should put it all back together again.

At least, this is my theory. But when I add the ">" character to the list of segment-end delimiters in the segmentation rule set, apply it to the project, and then import the JSON file, it doesn't work.

Any help from an experienced segmentation-rule-writer would be much appreciated!
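
To make the idea concrete, here is a minimal Python sketch of the behaviour being asked for: cut the text after every ">" and check that gluing the pieces back together reproduces the original string. It says nothing about how memoQ applies its rules internally; the function name split_after_gt and the `<br/>` tag in the sample are made up for illustration.

```python
import re


def split_after_gt(text: str) -> list[str]:
    """Cut text after every '>' and keep the '>' with the piece before it."""
    # The lookbehind matches the empty position right after a '>', so no
    # characters are consumed by the split. Empty pieces (for example when
    # the text ends in '>') are dropped.
    return [piece for piece in re.split(r"(?<=>)", text) if piece]


# Hypothetical sample: two sentences glued together by a tag ending in '>'.
sample = "First sentence.<br/>Second sentence.<br/>"
segments = split_after_gt(sample)
print(segments)

# Joining the pieces restores the original text exactly.
assert "".join(segments) == sample
```

Because the split consumes no characters, nothing is lost on the way back, which is the property the export step depends on.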


 
Tomás Cano Binder, BA, CT  Identity Verified
Spain
Local time: 15:36
Member (2005)
English to Spanish
Segmentation works... but needs a space Jun 23, 2023

You will see that adding ">" to a custom set of segmentation rules for your project does work, but only if there is a space after the ">". What memoQ's help does not tell you is that you need not only one of the sentence-ending symbols but also a space after it.

Is there any chance for you to replace ">" with "> " temporarily before you import the JSON file, translate that, and then remove the space with find/replace?

This would be the simplest approach, I think.
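
The find/replace round trip suggested here could be scripted roughly as follows. This is only a sketch run on plain strings outside memoQ, and the names pad_delimiter and unpad_delimiter are made up for illustration. It assumes the text does not already contain "> ", because otherwise removing the space afterwards would also delete spaces that belong to the original.

```python
def pad_delimiter(text: str, delimiter: str = ">") -> str:
    """Insert a space after every delimiter before importing the file."""
    padded = delimiter + " "
    if padded in text:
        # The reverse step could not tell original spaces from added ones.
        raise ValueError(f"text already contains {padded!r}")
    return text.replace(delimiter, padded)


def unpad_delimiter(text: str, delimiter: str = ">") -> str:
    """Remove the space after every delimiter once the file is exported."""
    return text.replace(delimiter + " ", delimiter)


# Hypothetical sample; the <br/> tag stands in for the client's markup.
sample = "First sentence.<br/>Second sentence."
padded = pad_delimiter(sample)
print(padded)

# Padding and then unpadding gives back the original text.
assert unpad_delimiter(padded) == sample
```

The up-front check is a design choice: failing early is safer than exporting a file in which legitimate spaces have silently disappeared.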

[Edited at 2023-06-23 06:14 GMT]


 
ALAN LAMBSON
United States
Local time: 07:36
Member (2021)
Spanish to English
TOPIC STARTER
Still doesn't work Jun 23, 2023

Hi Tomás,

After inserting a space after > in the source JSON, the segmentation still does not break segments after "> ". I then thought that perhaps memoQ is handling and in a special way, perhaps translating them into a carriage return/line feed. So I changed all > to @, after verifying that the @ character does not appear anywhere in the file. I then added @ to the list of segment-ending characters. It still does not break after "@ ". Could I be doing something else wrong?


 
Stepan Konev  Identity Verified
Russian Federation
Local time: 16:36
English to Russian
Example Jun 23, 2023

Can you give an example of your text for translation? 3 to 4 sentences.

 
Tomás Cano Binder, BA, CT  Identity Verified
Spain
Local time: 15:36
Member (2005)
English to Spanish
HTML embedded inside the JSON Jun 23, 2023

Alan was kind enough to share a sample with me privately, and it was quickly apparent that the client had embedded HTML code inside the JSON file. In the end, the solution was to import with a cascading filter (JSON + HTML filter) after a little tweaking of some tags that were not standard HTML.
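
For readers unfamiliar with the idea, the following Python sketch shows what chaining a JSON pass and an HTML pass amounts to conceptually: first pull the string values out of the JSON, then run each value through an HTML parser so that only the text between tags is left. It is not memoQ's implementation; the names TextCollector and translatable_text and the sample data are made up for illustration.

```python
import json
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Second pass: collect the text between HTML tags and skip the tags."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.chunks: list[str] = []

    def handle_data(self, data: str) -> None:
        if data.strip():
            self.chunks.append(data.strip())


def translatable_text(json_source: str) -> list[str]:
    """First pass: walk the JSON and hand every string value to the HTML pass."""

    def walk(node):
        if isinstance(node, str):
            parser = TextCollector()
            parser.feed(node)
            parser.close()  # flush any text still buffered by the parser
            yield from parser.chunks
        elif isinstance(node, dict):
            for value in node.values():
                yield from walk(value)
        elif isinstance(node, list):
            for item in node:
                yield from walk(item)

    return list(walk(json.loads(json_source)))


# Hypothetical sample: HTML markup embedded in a JSON string value.
sample = json.dumps(
    {"body": "<p>First sentence.</p><p>Second sentence.</p>", "title": "Hello"}
)
print(translatable_text(sample))
```

With the markup handled as tags by the second pass, each sentence becomes a separate unit again and can be matched against the legacy TM.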

 
ALAN LAMBSON
United States
Local time: 07:36
Member (2021)
Spanish to English
TOPIC STARTER
Tomás saves the day! Jun 23, 2023

Thank you, Tomás, for your excellent and timely help! Thank you, Stepan, for offering to help, but I'll consider this posting closed now.

 

