I need to sort several thousand PDF files, all stored in a single directory, into subdirectories.
The criterion is the presence, in the content of the files (not in the filenames), of one or more keywords listed in a CSV file with around 200 entries. It is rare for more than one keyword to appear in a single file, but it can happen; more likely, most of the keywords will not appear in ANY file, though that could change in the future.
So if I call the keywords A, B, C, ..., AA, AB, ..., ZZ, the results must be:
- list of files that contain only A;
- list of files that contain only B;
...
- list of files that contain only AA;
- list of files that contain only AB;
...
- list of files that contain only ZZ;
- list of files that contain A and B;
- list of files that contain A and C;
...
- list of files that contain A and B and C;
and so on.
The possible combinations are almost endless, but in practice, as I said, I don't expect many files with more than one keyword.
I'm interested in obtaining a series of file lists that I can grab and move into subdirectories named after the keywords they contain (the subdirectories can be created manually).
Does anyone have an idea whether Everything could help me?
Thanks in advance.
Searching in the contents of files for a list of keywords
Re: Searching in the contents of files for a list of keywords
It should be possible to generate the required searches and pass them to ES, the command-line interface.
ES can write the results out to a file list, e.g.:
ES.exe "c:\pdf folder\" ext:pdf content:A -export-efu pdf-with-A.efu
ES.exe "c:\pdf folder\" ext:pdf content:A content:B -export-efu pdf-with-A-B.efu
ES.exe "c:\pdf folder\" ext:pdf content:A content:B content:C -export-efu pdf-with-A-B-C.efu
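Note that content:A matches every file containing A, including files that also contain B. To get the "only A", "A and B", ... lists without writing out every combination, one option is to run a single search per keyword and then group the files by the exact set of keywords that matched. Below is a rough Python sketch of that idea, not a tested recipe. The function names es_command and group_by_keyword_set and the hits-A.txt filename are made up for illustration, and it assumes single-word keywords and that -export-txt writes one full path per line.

```python
from collections import defaultdict


def es_command(folder, keyword, out_file):
    """Build one ES command line in the same shape as the examples above.

    Assumption: keywords are single words, so no extra quoting is needed.
    """
    return f'ES.exe "{folder}" ext:pdf content:{keyword} -export-txt "{out_file}"'


def group_by_keyword_set(hits):
    """Group files by the exact set of keywords found in them.

    hits maps each keyword to the paths returned by its search
    (for example, the lines read from that keyword's exported list).
    """
    # First invert the mapping: path -> set of keywords that matched it.
    found = defaultdict(set)
    for keyword, paths in hits.items():
        for path in paths:
            found[path].add(keyword)

    # Then bucket paths under a name such as "A" or "A-B".
    groups = defaultdict(list)
    for path, keywords in found.items():
        groups["-".join(sorted(keywords))].append(path)
    return {name: sorted(paths) for name, paths in groups.items()}


if __name__ == "__main__":
    print(es_command("c:\\pdf folder\\", "A", "hits-A.txt"))
    # Hypothetical search results, standing in for the exported lists.
    hits = {"A": ["1.pdf", "2.pdf"], "B": ["2.pdf", "3.pdf"], "C": []}
    for name, paths in group_by_keyword_set(hits).items():
        print(name, paths)
```

Keywords that never match simply produce no group, so only the combinations that actually occur end up as lists. Each group name can then serve as the subdirectory name.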