You could use GNU grep with -A and -B to print exactly the parts of the file you want to exclude, but add the -n switch to also print the line numbers, then format the output and pass it as a command script to sed to delete those lines:

    grep -n -A1 -B2 PATTERN infile | \

This should also work with files of patterns passed to grep via -f, e.g.:

    grep -n -A1 -B2 -f patterns infile | \

I think this could be slightly optimized if it collapsed any three or more consecutive line numbers into ranges, though if the input has only a few matches it's not worth doing.

Other ways that don't preserve line order and are most likely slower:

With comm:

    comm -13 <(grep PATTERN -A1 -B2 <(nl -ba -nrz -s: infile) | sort) \
    <(nl -ba -nrz -s: infile | sort) | cut -d: -f2-

comm requires sorted input, which means the line order would not be preserved in the final output (unless your file is already sorted), so nl is used to number the lines before sorting, comm -13 prints only the lines unique to the 2nd file, and then cut removes the part that was added by nl (that is, the first field and the delimiter :).

With join:

    join -t: -j1 -v1 <(nl -ba -nrz -s: infile | sort) \
    <(grep PATTERN -A1 -B2 <(nl -ba -nrz -s: infile) | sort) | cut -d: -f2-

don's might be better in most cases, but just in case the file is really big, and you can't get sed to handle a script file that large (which can happen at around 5000+ lines of script), here it is with plain sed:

    sed -ne:t -e"/\n.*$match/D" \

This is an example of what is called a sliding window on input. It works by building a look-ahead buffer of $B-count lines before ever attempting to print anything.

And actually, I should probably clarify my previous point: the primary performance limiter for both this solution and don's is directly related to the interval. This solution will slow with larger interval sizes, whereas don's will slow with larger interval frequencies. In other words, even if the input file is very large, if the interval actually occurs only infrequently then don's solution is probably the way to go. However, if the interval size is relatively manageable and is likely to occur often, then this is the solution you should choose.

You can combine these options with the -i flag to further refine your searches.

Conclusion

By using the -i flag with the grep command, you can search for text without regard to case. This can be particularly useful when working with files that may contain text in different cases, or when searching for text that may be typed in different cases. By following the steps outlined in this article, you can use grep to ignore case in your searches and take advantage of its powerful pattern-matching capabilities.
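The grep-to-sed approach described above (print the unwanted lines with their numbers, turn the numbers into delete commands, feed those to sed) can be sketched roughly as follows. The source shows only the first grep stage, so the two sed stages here are an assumption that merely follows the prose description. The helper name `delete_around` and the sample data are hypothetical. Reading the script from `/dev/stdin` is a portability choice made for this sketch.

```shell
#!/usr/bin/env bash
# Hypothetical sample input; the file name "infile" mirrors the text above.
tmp=$(mktemp -d)
printf '%s\n' one two three MATCH five six seven > "$tmp/infile"

# delete_around PATTERN BEFORE AFTER FILE  (hypothetical helper)
# Deletes each matching line plus BEFORE lines above and AFTER lines below it.
delete_around() {
    # Stage 1: grep prints the lines to drop, prefixed "N:" (match) or "N-" (context).
    # Stage 2 (assumed): keep only the leading number and append "d" -> "2d", "3d", ...
    #                    The "--" group separators do not start with a digit and are skipped.
    # Stage 3 (assumed): run the generated delete script against the original file.
    grep -n -A"$3" -B"$2" -e "$1" "$4" |
        sed -n 's/^\([0-9]\{1,\}\)[:-].*/\1d/p' |
        sed -f /dev/stdin "$4"
}

result=$(delete_around MATCH 2 1 "$tmp/infile")
printf '%s\n' "$result"
```

With the sample file, lines 2 through 5 (two lines before the match, the match itself, and one line after) are removed. When nothing matches, the generated script is empty and sed passes the file through unchanged.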