Source term collector
Thread poster: Hans Lenting
Netherlands
Member (2006)
German to Dutch
Jul 6, 2024

Many CAT tools can list the frequent source terms of a project, but this process usually produces a lot of garbage. Is there a program that looks only at the words immediately to the left and right of frequent nouns and then lists the resulting groups of two or three words?

 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Source fragment harvester Jul 7, 2024

I should have chosen "Source fragment harvester" as the subject.

Since there have been no replies to my post, I'd like to share an idea I've had since posting it:

Use a regular expression to extract the candidates.

Sort in Excel and delete the noise.
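
The two steps above can be sketched in Python. This is only an illustration, not the regular expression from the screenshots, which is not reproduced in the text. The `WORD` pattern, the function names `harvest_fragments` and `to_tsv`, the sentence-splitting rule, and the `min_count` threshold are all assumptions made for the example. It counts every run of two or three consecutive words and returns tab-separated lines that can be pasted into Excel for sorting and manual clean-up.

```python
import re
from collections import Counter

# Assumed definition of a "word": a run of letters, optionally joined
# by inner hyphens or apostrophes (digits and underscores excluded).
WORD = r"[^\W\d_]+(?:[-'][^\W\d_]+)*"


def harvest_fragments(text, sizes=(2, 3)):
    """Count every run of 2 or 3 consecutive words inside a sentence."""
    counts = Counter()
    # Split on sentence-ending punctuation and line breaks so that
    # fragments never span two sentences.
    for sentence in re.split(r"[.!?;:\n]+", text):
        # Lowercasing merges "The valve" and "the valve"; drop .lower()
        # if capitalisation matters (e.g. German nouns).
        words = re.findall(WORD, sentence.lower())
        for n in sizes:
            for i in range(len(words) - n + 1):
                counts[" ".join(words[i:i + n])] += 1
    return counts


def to_tsv(counts, min_count=2):
    """Return 'count<TAB>fragment' lines, most frequent first."""
    rows = [(c, f) for f, c in counts.items() if c >= min_count]
    rows.sort(key=lambda row: (-row[0], row[1]))
    return "\n".join(f"{c}\t{f}" for c, f in rows)
```

Calling `to_tsv(harvest_fragments(source_text))` on the exported source text gives a candidate list; fragments that start or end with function words still have to be deleted by hand or filtered in a later step.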

Screenshot 2024-07-07 at 14.01.15

Screenshot 2024-07-07 at 14.00.59

[Edited at 2024-07-07 12:20 GMT]


 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Got this suggestion Jul 8, 2024

A kind person gave me this suggestion:

sed -E 's/( a| all| allows| and| are| at| by| for| in| is| of| on| or| the| to| with)$//'
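
For anyone who prefers to do this clean-up in a script rather than with sed, here is a rough Python equivalent. The stop-word list is taken from the sed command above; the names `STOP_WORDS`, `TRAILING` and `trim_trailing_stop_words` are made up for this sketch. One deliberate difference, which is my assumption and not part of the suggestion: the pattern strips a whole run of trailing stop words, whereas the sed command removes only the last one per pass.

```python
import re

# Stop words from the sed command above, with duplicates dropped.
STOP_WORDS = ["a", "all", "allows", "and", "are", "at", "by", "for",
              "in", "is", "of", "on", "or", "the", "to", "with"]

# One or more trailing stop words, each preceded by a single space,
# anchored at the end of the fragment.
TRAILING = re.compile(r"(?: (?:" + "|".join(STOP_WORDS) + r"))+$")


def trim_trailing_stop_words(fragment):
    """Remove stop words from the end of a candidate fragment."""
    return TRAILING.sub("", fragment)
```

Like the sed version, this is case-sensitive and only touches the end of each fragment, so a candidate such as "of the valve" is left unchanged.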


 

