Given your example, you ignore the n^2 because n^12 grows SO much faster. For any "normal" input size, n^12 wipes out 4*n^2 in terms of cost.
Certainly there are cases where the "insignificant" parts of an algorithm are not so insignificant, correct? In a simple case n^12 dominates a few n^2 sections, but there must be exceptions to that.
Are there cases where a more complete analysis is done -- like comparing two similar algorithms? Or at that point are you actually implementing the algorithms to test and measure?
Yes, there are cases where the "insignificant" part is so significant that it's actually more efficient to use an asymptotically slower algorithm for everyday problem sizes. Big O notation is not so much about running speed as about scaling.
While the "insignificant" part would usually have to be really large to influence the running time, it's often the case that one algorithm is faster than another because the second has a large constant multiplier, even though Big O notation would suggest otherwise (e.g. an algorithm taking 10*n*log n steps beats one taking 1000*n steps for any realistic n, even though O(n) is asymptotically better than O(n log n)).
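To make the constants concrete, here's a small sketch; the two cost functions are just the hypothetical 10*n*log n and 1000*n examples from above, not real algorithms:

```python
import math

def cost_nlogn(n):
    """Hypothetical algorithm costing 10 * n * log2(n) steps."""
    return 10 * n * math.log2(n)

def cost_linear(n):
    """Hypothetical algorithm costing 1000 * n steps."""
    return 1000 * n

# The asymptotically "worse" O(n log n) algorithm is cheaper until
# 10 * log2(n) exceeds 1000, i.e. until n reaches 2**100.
for n in (10**3, 10**6, 10**9):
    print(n, cost_nlogn(n) < cost_linear(n))  # True for every realistic n
```

The crossover at n = 2^100 is far beyond any input you could ever store, which is exactly why the constant matters here.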
It's also common that even if one algorithm is slower in the worst case, its average-case performance is better than that of asymptotically better solutions. A popular example is quicksort, which has worst-case running time O(n^2) but in practice is usually faster than most O(n log n) sorting algorithms.
"Yes, there are cases where the 'insignificant' part is so significant it's actually more efficient to use asymptotically slower algorithm for everyday problem sizes. The Big O notation is not so much about running speed but about scaling."
I do understand that. My point was that the analysis is likely too optimistic in some cases. Maybe that's not an issue. I guess the whole thing is fairly arbitrary anyway, and conservative, since we're going to use "worst case" wherever it can be applied.
It's not about optimism, but about the eventual dominance of functions.
If you have n^7 and 5(n^6 + 7n^2), then for every n >= 6, the first is larger. In general, for every pair of polynomials f and g of max degrees s and t respectively, f will eventually dominate g if and only if s > t. Constant multiples are taken into account in big-O, but in a sense they can only delay the inevitable - for any nonzero constant multiples you pick to multiply f and g by, there will always exist a point n such that after that point, f dominates g forever.
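The crossover is easy to check numerically; f and g here are just the two example polynomials above:

```python
def f(n):
    return n**7

def g(n):
    return 5 * (n**6 + 7 * n**2)

print(f(5) < g(5))                                # True: g still wins at n = 5
print(all(f(n) > g(n) for n in range(6, 10000)))  # True: f dominates from n = 6 on
```

The constant 5 pushes the crossover point out a little, but past n = 6 the higher degree wins forever.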
That n, however, might be too large for your purposes - perhaps that's what you mean by optimism. Sedgewick has written about other kinds of analysis that take this into account; see Algorithms for the Masses: http://www.cs.princeton.edu/~rs/talks/AlgsMasses.pdf .
I think you're getting tripped up on what the analysis is. It's a worst-case analysis. That is, at every step of the way, you're figuring out what is the worst possible behavior. Hence, you won't ever fall into the trap you're worried about. When doing worst-case analysis, and we say "this section of code is n^2", and "this section of code is n^3", as n grows, that will always be true.
You're probably thinking of not worst-case analysis (which is what is typically done for Big O), but average-case analysis. That is, you're not asking about what is the worst possible running time, but on average, what will the typical running time be? That requires a more complicated analysis. Specifically, we have to start giving probability distributions to our inputs. In worst-case analysis, reasoning about the inputs is easy: just figure out what inputs will give us the worst performance, no matter how rare those inputs may be. In average-case analysis, we have to start figuring out what the likely inputs will be. That requires more work and more assumptions. Typically, you do average-case analysis with a particular problem domain in mind - otherwise, you can't reason about what kinds of inputs are frequent or rare.
It is just a simple kind of analysis. Normally, people working on algorithm and data structure analysis do large amounts of fairly advanced math to analyse the complexity in detail, including all the "insignificant" parts.
Apart from worst-case complexity, they would also analyse average complexity, amortized complexity, etc.
For example, binary search in an array of strings would be overkill and a performance hog if the list has only a dozen elements on average (for example, a table of methods or commands).
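A sketch of the two approaches (the command table is made up for illustration). For a dozen elements, the plain scan's lack of bookkeeping often wins in practice even though its Big O is worse, and it doesn't require keeping the list sorted:

```python
from bisect import bisect_left

def linear_find(items, target):
    """Plain scan: O(n), but minimal overhead for small n."""
    for i, item in enumerate(items):
        if item == target:
            return i
    return -1

def binary_find(sorted_items, target):
    """Binary search: O(log n), but requires a sorted list."""
    i = bisect_left(sorted_items, target)
    if i < len(sorted_items) and sorted_items[i] == target:
        return i
    return -1

commands = sorted(["close", "open", "read", "seek", "stat", "write"])
print(linear_find(commands, "read"))  # both return the same index
print(binary_find(commands, "read"))
```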
Logs and fractional exponents often require deeper analysis. If n = 1,000,000 then log n (base 10) = 6, so n log n would actually be faster than 7n on a data set of that size - though note that the base of the logarithm matters for this kind of comparison.
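A quick numeric check of the point above, showing that the comparison flips depending on which log base you use:

```python
import math

n = 1_000_000
print(math.log10(n))              # 6.0
print(n * math.log10(n) < 7 * n)  # True: 6n < 7n
print(n * math.log2(n) < 7 * n)   # False: log2 of a million is about 19.9
```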
Certain special cases also require further analysis. Some algorithms run extremely fast, or extremely slow, on specific types of data. For example, with data that's expected to be nearly sorted, some sorting algorithms will run faster than expected while others can choke.
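Insertion sort is the standard example of a sort that exploits near-sortedness; this sketch counts its comparisons to show the spread between its best and worst inputs:

```python
def insertion_sort_comparisons(data):
    """Insertion sort with a comparison counter (counter is for demonstration)."""
    xs = list(data)
    comparisons = 0
    for i in range(1, len(xs)):
        j = i
        while j > 0:
            comparisons += 1
            if xs[j - 1] > xs[j]:
                xs[j - 1], xs[j] = xs[j], xs[j - 1]
                j -= 1
            else:
                break  # element is in place; nearly sorted input exits early
    return comparisons

n = 1000
print(insertion_sort_comparisons(range(n)))        # 999: one comparison per element
print(insertion_sort_comparisons(range(n, 0, -1))) # 499500: reversed input, n*(n-1)/2
```

The same O(n^2) algorithm runs in linear time on already-sorted data, which is why "expected to be nearly sorted" changes which sort you should pick.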
Operations that run at very different speeds also require deeper analysis (or testing). If an algorithm requires O(n) disk reads, O(n^2) memory allocation, and O(n^4) calculations, it's possible for any one of those factors to dominate on real-life data sets.
In terms of just the expression, yes, you just count the largest term. In cases where you count more closely, it's either in the constant (quicksort vs mergesort) or in terms of removing randomness or amortization.