# --- print second column
ls -al | awk '{print $2}'
ls -al | pru 'split(" ")[1]'
ls -al | scc -n 'F(1)'
# --- count and average of the integers in the second column
ls -al | awk '{ s += $2; } END {print "average" ,int(s/NR);print "count ",int(NR)}'
ls -al | pru 'split(" ")[1]' '"average #{mean(&:to_i)}\ncount #{size}"'
ls -al | scc 'int c=0; WRL c+=F(1); FMT("average %s\ncount %s") %(c/NR) %NR'
# --- count lines
ls -al | wc -l
ls -al | pru -r 'size'
ls -al | scc 'WRL;NR+1'
# --- replace a 5 with five
ls -al | sed 's/5/five/'
ls -al | pru 'gsub(/5/,"five")'
ls -al | scc -n 'RR(line,R("5"),"five")'
# --- every second line
ls -al | pru 'i % 2 == 0'
ls -al | scc -n 'NR % 2 ? line : ""'
# sum up df's used-space column
df | awk '{n+=$3;}; END{print n}'
df | pru ?????
df | scc 'int n=0; WRL n+=F(2); n'
> Note that the worst case of complexity for this algorithm is much, much worse than the worst case complexity for Boyer Moore.
Can you explain your thought process for why the worst-case complexity is worse than the other's? Did you do any measurements? I believe you are incorrect. I am the author of that page. I've just done a quick test with SS = "aaa...aaaBaaa...aaa", SS_size = 240. My algorithm is faster than 3 of the 4 BM implementations tested.
> Do not use this algorithm carelessly. For example, if you use it in a thoughtless way in your web server, you may open yourself to a DoS attack.
The same can be said about all BM variants, the naive algorithm, and BSD's memmem/strstr. The possibility of a DoS against substring-search algorithms has been known for a long time, but it never materialized. The cure is trivial: limit the substring size.
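To make the quadratic worst case concrete, here is a minimal sketch (Python, not tied to any of the implementations discussed) that counts character comparisons in a naive substring search on a pathological input. The cost grows roughly as N*M, which is exactly what capping the substring size bounds:

```python
def naive_find(haystack, needle):
    """Naive substring search; returns (match index or -1, comparison count)."""
    comparisons = 0
    n, m = len(haystack), len(needle)
    for i in range(n - m + 1):
        for j in range(m):
            comparisons += 1
            if haystack[i + j] != needle[j]:
                break
        else:
            return i, comparisons
    return -1, comparisons

# Pathological input: every alignment matches m-1 characters before failing,
# so the total work is (n - m + 1) * m, i.e. roughly n * m comparisons.
text = "a" * 10_000
pattern = "a" * 99 + "b"          # m = 100
idx, cost = naive_find(text, pattern)
print(idx, cost)                  # -1, (10000 - 100 + 1) * 100 = 990100
```

With the needle length limited to a small constant, the same bound collapses to O(N), which is the "trivial cure" mentioned above.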
> That immediately made me think he might have devised it for genomics research.
No, it wasn't. It was devised while I was preparing for a Google interview (which I failed). And of course, algorithms that pre-index the haystack will always be faster. I myself do not consider haystack pre-indexing algorithms to be "text search" algorithms (maybe incorrectly).
> gross exaggeration
If you count pre-indexing algorithms and edge cases, then you are correct. For the most common case: can you show me something faster (from a student, or even yourself)?
> Can you explain your thought process for why the worst-case complexity is worse than the other's? Did you do any measurements? I believe you are incorrect. I am the author of that page. I've just done a quick test with SS = "aaa...aaaBaaa...aaa", SS_size = 240. My algorithm is faster than 3 of the 4 BM implementations tested.
You write on the page that your algorithm has "O(NM) worst case complexity." Compare that with Boyer-Moore's time complexity. The grandparent comment was interested in asymptotics, not in specific examples.
> Same can be said about all BM, naive and BSD's memmem/strstr.
Red herring.
> Possibility of DOS for substring search algorithm was known for a long time - but it never materialized.
What are you talking about?
> If you count pre-indexing algorithms and edge cases - then you are correct. For most common case - can you show me something faster (from a student or even yourself)?
What do you mean by edge cases? All the cases that make your program run slow?
Using your terminology, KMP has complexity O(N + M). That's better than O(MN).
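For reference, here is a minimal textbook KMP sketch (Python, not any of the implementations discussed above) that achieves the O(N + M) bound. The key property is that the text pointer never moves backwards:

```python
def kmp_find(haystack, needle):
    """Knuth-Morris-Pratt search: O(len(haystack) + len(needle)) worst case.
    Returns the index of the first match, or -1."""
    if not needle:
        return 0
    # Failure function: fail[j] = length of the longest proper prefix of
    # needle[:j+1] that is also a suffix of it.
    fail = [0] * len(needle)
    k = 0
    for j in range(1, len(needle)):
        while k and needle[j] != needle[k]:
            k = fail[k - 1]
        if needle[j] == needle[k]:
            k += 1
        fail[j] = k
    # Scan the text; on a mismatch, fall back in the needle, never in the text.
    k = 0
    for i, c in enumerate(haystack):
        while k and c != needle[k]:
            k = fail[k - 1]
        if c == needle[k]:
            k += 1
        if k == len(needle):
            return i - k + 1
    return -1

print(kmp_find("aaaBaaa", "aB"))  # 2
```

On the pathological all-`a` inputs discussed above, KMP does a linear number of comparisons where the naive algorithm does ~N*M.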
Another happy customer of prgmr. I've been with them for 4 years. You kind of forget about them, because there hasn't been a single issue or outage in all that time.