# --- print second column
ls -al | awk '{print $2}'
ls -al | pru 'split(" ")[1]'
ls -al | scc -n 'F(1)'
# --- count and average of the integers in the second column
ls -al | awk '{ s += $2; } END {print "average" ,int(s/NR);print "count ",int(NR)}'
ls -al | pru 'split(" ")[1]' '"average #{mean(&:to_i)}\ncount #{size}"'
ls -al | scc 'int c=0; WRL c+=F(1); FMT("average %s\ncount %s") %(c/NR) %NR'
# --- count lines
ls -al | wc -l
ls -al | pru -r 'size'
ls -al | scc 'WRL;NR+1'
# --- replace a 5 with five
ls -al | sed 's/5/five/'
ls -al | pru 'gsub(/5/,"five")'
ls -al | scc -n 'RR(line,R("5"),"five")'
# --- every second line
ls -al | pru 'i % 2 == 0'
ls -al | scc -n 'NR % 2 ? line : ""'
# sum up df's used-space column
df | awk '{n+=$3;}; END{print n}'
df | pru ?????
df | scc 'int n=0; WRL n+=F(2); n'
> Note that the worst case of complexity for this algorithm is much, much worse than the worst case complexity for Boyer Moore.
Can you explain your thought process for why the worst-case complexity is worse than the other's? Did you do any measurements? I believe you are incorrect. I am the author of that page. I've just done a quick test with SS = "aaa...aaaBaaa...aaa", SS_size = 240. My algorithm is faster than 3 of the 4 BM implementations tested.
> Do not use this algorithm carelessly. For example, if you use it in a thoughtless way in your web server, you may open yourself to a DoS attack.
The same can be said about all BM variants, the naive algorithm, and BSD's memmem/strstr. The possibility of a DoS against substring-search algorithms has been known for a long time, but it never materialized. The cure is trivial: limit the substring size.
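To make the quadratic worst case concrete, here is a minimal sketch (Python, not tied to any of the implementations discussed) that counts character comparisons in a naive substring search on a pathological input. The cost grows roughly as N*M, which is exactly what capping the substring size bounds:

```python
def naive_find(haystack, needle):
    """Naive substring search; returns (match index or -1, comparison count)."""
    comparisons = 0
    n, m = len(haystack), len(needle)
    for i in range(n - m + 1):
        for j in range(m):
            comparisons += 1
            if haystack[i + j] != needle[j]:
                break
        else:
            return i, comparisons
    return -1, comparisons

# Pathological input: every alignment matches m-1 characters before failing,
# so the total work is (n - m + 1) * m, i.e. roughly n * m comparisons.
text = "a" * 10_000
pattern = "a" * 99 + "b"          # m = 100
idx, cost = naive_find(text, pattern)
print(idx, cost)                  # -1, (10000 - 100 + 1) * 100 = 990100
```

With the needle length limited to a small constant, the same bound collapses to O(N), which is the "trivial cure" mentioned above.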
> That immediately made me think he might have devised it for genomics research.
No, it wasn't. It was devised while I was preparing for a Google interview (which I failed). And of course, algorithms that pre-index the haystack will always be faster. I myself do not consider haystack pre-indexing algorithms to be "text search" algorithms (maybe incorrectly).
> gross exaggeration
If you count pre-indexing algorithms and edge cases, then you are correct. For the most common case: can you show me something faster (from a student, or even yourself)?
> Can you explain your thought process for why the worst-case complexity is worse than the other's? Did you do any measurements? I believe you are incorrect. I am the author of that page. I've just done a quick test with SS = "aaa...aaaBaaa...aaa", SS_size = 240. My algorithm is faster than 3 of the 4 BM implementations tested.
You write on the page that your algorithm has "O(NM) worst case complexity." Compare that with Boyer-Moore's time complexity. The grandparent comment was interested in asymptotics, not in specific examples.
> Same can be said about all BM, naive and BSD's memmem/strstr.
Red herring.
> Possibility of DOS for substring search algorithm was known for a long time - but it never materialized.
What are you talking about?
> If you count pre-indexing algorithms and edge cases - then you are correct. For most common case - can you show me something faster (from a student or even yourself)?
What do you mean by edge cases? All the cases that make your program run slow?
Using your terminology, KMP has complexity O(N + M). That's better than O(MN).
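For reference, here is a minimal textbook KMP sketch (Python, not any of the implementations discussed above) that achieves the O(N + M) bound. The key property is that the text pointer never moves backwards:

```python
def kmp_find(haystack, needle):
    """Knuth-Morris-Pratt search: O(len(haystack) + len(needle)) worst case.
    Returns the index of the first match, or -1."""
    if not needle:
        return 0
    # Failure function: fail[j] = length of the longest proper prefix of
    # needle[:j+1] that is also a suffix of it.
    fail = [0] * len(needle)
    k = 0
    for j in range(1, len(needle)):
        while k and needle[j] != needle[k]:
            k = fail[k - 1]
        if needle[j] == needle[k]:
            k += 1
        fail[j] = k
    # Scan the text; on a mismatch, fall back in the needle, never in the text.
    k = 0
    for i, c in enumerate(haystack):
        while k and c != needle[k]:
            k = fail[k - 1]
        if c == needle[k]:
            k += 1
        if k == len(needle):
            return i - k + 1
    return -1

print(kmp_find("aaaBaaa", "aB"))  # 2
```

On the pathological all-`a` inputs discussed above, KMP does a linear number of comparisons where the naive algorithm does ~N*M.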
Another happy customer of prgmr. I've been with them for 4 years. You kind of forget about them, because there hasn't been a single issue or outage in all that time.