Promises and monads are very different, though. Promises exist to "simplify" asynchronous coding in imperative languages, while monads are a general computational technique for writing syntactically imperative-looking code in purely functional languages, built on top of type classes. Without understanding why Haskell is purely functional, you can't even understand the motivation behind monads. And sure, you can extend a JavaScript prototype (or any object) to do something monad-like, but without all of Haskell's strictness, you'd miss the gist.
> monads are a general computational methodology towards writing syntactically imperative-looking code in purely functional languages
I think you are conflating "monadic do-syntax" with the larger concept of the monad (in Haskell and elsewhere).
Many constructs in Haskell (monads/comonads/applicatives/free/cofree etc.) are just that: constructs (drawn from theory) which happen to have interesting properties.
And people keep finding interesting applications for them. Some of those applications have stood the test of time and are hence more familiar (e.g. effect ordering with monads).
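To make the "monads are just a construct, independent of do-syntax" point concrete, here's a toy sketch in Python (all names here are invented for illustration): the Maybe monad reduced to a `bind` function, with `None` standing in for Haskell's `Nothing`.

```python
def bind(value, f):
    """Monadic bind for a Maybe-like monad: short-circuit on None,
    otherwise feed the value to the next computation."""
    return None if value is None else f(value)

def safe_div(x, y):
    """A partial function wrapped to return None instead of raising."""
    return None if y == 0 else x / y

# Chaining computations that may fail, with no explicit None checks.
# Haskell's do-notation is just sugar over exactly this kind of chain.
result = bind(bind(safe_div(10, 2), lambda a: safe_div(a, 5)),
              lambda b: safe_div(b, 0))
# result is None: the final division by zero short-circuits the chain.
```

Of course this misses the type-class machinery and compiler-checked laws that make the construct interesting in Haskell, which is rather the parent's point.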
My gut says this sort of alternative to backpropagation for training has a lot of uses where SVMs have no applicability. The article talks a lot about RNNs (neural nets for sequence prediction), but I would guess it has uses in online learning as well. Learning twice as fast in those situations seems pretty significant to me.
I can't believe I'm getting down-voted just because I'm not bullish on ANN.
As I said, in my humble (and educated) opinion, NNs don't really have a lot of practical use. So long as they have to be processed in parallel, SVMs will always have the advantage that they can be computed sequentially, meaning they run much faster and without the need for specialized hardware. SVMs and ANNs solve the same machine-learning problem: both are methods for classifying data. SVMs just do it much faster and by more practical means.
The classical solver used to train kernel SVMs, implemented in libsvm [1], has a time complexity between O(n^2) and O(n^3), where n is the number of labeled samples in the training set. In practice it becomes intractable to train a non-linear kernel SVM as soon as the training set is larger than a few tens of thousands of labeled samples.
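One way to see where that scaling wall comes from: a kernel-SVM solver works over the n x n Gram matrix of pairwise kernel values, which alone grows quadratically with the training set. A minimal pure-Python sketch (toy data, RBF kernel written out by hand for illustration):

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """RBF kernel between two feature vectors (plain lists)."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

def gram_matrix(X, gamma=1.0):
    """The n x n kernel matrix the solver optimizes over.
    Doubling the training set quadruples this matrix, before the
    solver has even started its O(n^2)-O(n^3) work."""
    return [[rbf_kernel(xi, xj, gamma) for xj in X] for xi in X]

X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # n = 3 toy samples
K = gram_matrix(X)
```

At n = 50,000 that matrix has 2.5 billion entries, which is why libsvm-style training stalls long before the "millions of samples" regime where SGD-trained nets live.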
Deep Neural Networks trained with variants of Stochastic Gradient Descent on the other hand have no problem scaling to training sets with millions of labeled samples which makes them suitable to solve large industrial-scale problems (e.g. speech recognition in mobile phones or computer vision to help moderate photos that are posted on social networks).
SVMs can be useful in the small-training-set regime (fewer than 10,000 training examples). But for that class of problems, it's also perfectly reasonable to use a single CPU (with 2 or 4 cores) and a good linear algebra library such as OpenBLAS or MKL to train an equally powerful fully connected neural network with 1 or 2 hidden layers. Hyper-parameter tuning can be easier for beginners using SVMs with the default kernels (e.g. RBF or polynomial), but with modern optimizers like Adam, implemented in well-designed and well-documented high-level libraries like Keras, it has become very easy to train neural networks that just work.
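Just to show how little machinery such a small fully connected net actually needs, here's a pure-Python sketch (no Keras, no Adam — plain SGD with hand-written backprop on a toy XOR dataset; everything here is invented for illustration):

```python
import math, random

random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy XOR dataset: not linearly separable, so a hidden layer is needed.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

H = 4  # hidden units in a 2-H-1 fully connected net
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [0.0] * H
W2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = 0.0

def forward(x):
    h = [sigmoid(sum(w * xi for w, xi in zip(W1[j], x)) + b1[j])
         for j in range(H)]
    y = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
    return h, y

def mse():
    return sum((forward(x)[1] - t) ** 2 for x, t in data) / len(data)

loss_before = mse()
lr = 0.5
for _ in range(2000):              # plain SGD over the four examples
    for x, t in data:
        h, y = forward(x)
        dy = 2 * (y - t) * y * (1 - y)   # gradient at the output unit
        # gradient at each hidden unit, using the pre-update W2
        dh = [dy * W2[j] * h[j] * (1 - h[j]) for j in range(H)]
        for j in range(H):
            W2[j] -= lr * dy * h[j]
            b1[j] -= lr * dh[j]
            for i in range(2):
                W1[j][i] -= lr * dh[j] * x[i]
        b2 -= lr * dy
loss_after = mse()   # loss drops well below its initial value
```

This is of course what OpenBLAS/MKL-backed libraries do at scale with vectorized matrix multiplies instead of Python loops; the point is just that at this size, nothing exotic is required.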
Also, for many small- to medium-scale problems that are not signal-style problems [2], Random Forests and Gradient Boosted Trees tend to perform better than SVMs. Most Kaggle competitions are won with linear models (e.g. logistic regression), Gradient Boosting, neural networks, or a mix of those. Very few competitors have used kernel-based SVMs in a winning entry, AFAIK.
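For anyone unfamiliar with gradient boosting, the core idea is tiny: repeatedly fit a weak learner (here a depth-1 decision stump) to the current residuals and add a damped copy of it to the ensemble. A toy 1-D regression sketch (all data and names invented for illustration):

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 regression stump: pick the threshold that minimizes
    squared error when predicting the mean residual on each side."""
    best = None
    for thr in xs:
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lm = sum(left) / len(left) if left else 0.0
        rm = sum(right) / len(right) if right else 0.0
        err = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, thr, lm, rm)
    _, thr, lm, rm = best
    return lambda x, thr=thr, lm=lm, rm=rm: lm if x <= thr else rm

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # toy 1-D step function
ys = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0]

lr = 0.5                   # shrinkage / learning rate
pred = [0.0] * len(xs)
for _ in range(20):        # each round fits a stump to the residuals
    residuals = [y - p for y, p in zip(ys, pred)]
    stump = fit_stump(xs, residuals)
    pred = [p + lr * stump(x) for p, x in zip(pred, xs)]
# pred is now within ~1e-3 of ys
```

Real libraries (XGBoost, LightGBM) use deeper trees, second-order gradients, and clever split finding, but the residual-fitting loop above is the essence.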
> I can't believe I'm getting down-voted just because I'm not bullish on ANN.
probably it is happening because ANNs definitely do have some advantages over SVMs for modelling real-world phenomena.
specifically, ANNs are _parametric_ while SVMs are nonparametric, in the sense that an ANN's model is a fixed set of weights and biases (a bunch of hidden layers of varying sizes, chosen up front); its size doesn't grow with the training set.
SVMs otoh (at least in the kernelized case) consist of a set of support vectors selected from the training set, which in the worst case can be as large as the training set itself.
modelling real-world phenomena, for example optimal air conditioning in a data center based on a large number of external inputs, is far more tractable with ANNs than with SVMs. ANNs are, after all, universal approximators. with SVMs you have to guess the kernel...
> I can't believe I'm getting down-voted just because I'm not bullish on ANN.
I don't think it's that - I think it is because you are factually wrong. In particular this part: "ANN are not that big of a deal, IMHO, when you compare them to other machine learning techniques i.e. Support Vector Machines. " is wrong.
(Deep) Neural Networks are a very, very big deal because they work so much better than SVMs in every domain where there is sufficient training data and sufficient time (and enough GPUs!) to train them.
This post is a big deal because it shows a way to cut down that training time.
I believe SVM>ANN might be true if the classification task is relatively simple. Most of the state-of-the-art computer vision algorithms are based on ANNs, though (just as an example). Do you think SVMs will catch up in those areas?
(I upvoted you even though I don't agree with you, just thought getting greyed out was excessive)
As soon as I read the article, I too wanted to check whether the NYT is owned by private equity. According to Wikipedia, it's partly publicly traded on the NYSE and partly privately owned: https://en.wikipedia.org/wiki/The_New_York_Times_Company. I wouldn't know where to find out how much of the company is privately owned versus publicly traded, though. But spot on, it's a publicity piece (and you've played your part by clicking). I think most would see that from the presentation, though. It was still entertaining.
Being privately owned isn't the same thing as being owned by private equity. And looking at the NYT it has Class A and Class B shares, where only the Class B shares are privately owned. The usual reason for dividing the shares into different classes is to give the owners of one class rights that the other class doesn't have. I'm not sure what, if any, differences there are between the NYT's Class A and Class B shares.
I'll admit, I don't know the specific legal definition of private equity. According to Wikipedia (https://en.wikipedia.org/wiki/Private_equity): "In finance, private equity is an asset class consisting of equity securities and debt in operating companies that are not publicly traded on a stock exchange." There are probably more legal specifics that further divide private equity from just anything that's not publicly traded. But you're on the right track: if in fact the private Class B shares override the rights of the Class A shares (significantly), then the NYT could be seen as just another shady entity in our lives that's controlled by private equity, as the link tries to illuminate.
What irony? There is no connection. What is so "ironic" about a private business being private? Or does anyone - apart from former Eastern Bloc countries - think newspapers should be state-owned? The topic is the privatization of government functions. Should private media not write about it? I don't understand. In order to write about this topic, do we need non-private media, i.e. only individual bloggers, state-owned media (China?), and maybe NPR?
It is visually appealing, and I like its attempt to bring exposure to something the general public has little understanding of. This topic definitely deserves more than just a publicity piece, especially something as complicated as the financial industry.
I mentioned it in the comments of Edward Yang's article, but I'll mention it here again: Template Haskell is based on C++ templates. Here's the original paper: http://research.microsoft.com/en-us/um/people/simonpj/papers....
Haskell was written to mimic C++ more than Lisp, at least when it comes to metaprogramming. And if you read the paper, having the compiler check the template code before expansion is there by design; that's why it's compile-time only.
> Haskell was written to mimic C++ more than Lisp, at least when it comes to metaprogramming.
Depends what you mean by "metaprogramming". Most people would consider type class "abuse" metaprogramming, but it's definitely not something inspired by C++.
The wikipedia definition is good: "Metaprogramming is the writing of computer programs with the ability to treat programs as their data. It means that a program could be designed to read, generate, analyse or transform other programs, and even modify itself while running."
In my own words: code that can read and write code (whether that happens at compile time or at run time).
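For instance, a toy Python sketch of exactly that definition — a program that generates another program's source, reads it back as data, analyses it, and then runs it (the `square` function is just an invented example):

```python
import ast

# Generate: build the source text of a program as ordinary data.
source = "def square(n):\n    return n * n\n"

# Read / analyse: parse the generated code into a syntax tree and
# inspect it, treating the program as a data structure.
tree = ast.parse(source)
func_names = [node.name for node in ast.walk(tree)
              if isinstance(node, ast.FunctionDef)]

# Transform into executable form and run it in a fresh namespace.
namespace = {}
exec(compile(tree, "<generated>", "exec"), namespace)
# namespace["square"](7) evaluates the code we just wrote as text
```

Lisp macros, C++ templates, and Template Haskell all live somewhere on this read/generate/transform spectrum; they differ mainly in whether it happens at compile time and how much the compiler checks.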
Right, so then Haskell type classes qualify, and my original point stands. Certainly the type class language is limited, but common extensions make it Turing complete [1].
It's kind of cruel to come along later and downvote an early commenter who helped rescue a good article from oblivion and put their name to a quick recommendation.
I agree, things are probably too good to be true right now. I really don't think it's fair to blame engineers, though. My only advice to people working in this industry is to save your money: keep an emergency fund of some kind in case something happens. That's my best practical advice.