Monday, January 28, 2013
Unit tests are critical for quality code – they help you:
- find problems early
- facilitate change by avoiding breakage of other functionality
- simplify integration via a bottom up approach
- document the code
- maintain good design – testable code is good code: single intent, clear retval etc
Google has created a unit testing framework for C++ :
http://code.google.com/p/googletest/
I downloaded Google Test from http://code.google.com/p/googletest/downloads/list and built the solution under the msvc folder (I'm using Visual C++ today). The build creates a lib file at gtest-1.6.0\msvc\gtest\Debug\gtestd.lib
I then created a new VC++ Win32 Console App and set the following project properties:
VC++ Directories > Include Directories – added path to ..\ gtest-1.6.0\include
VC++ Directories > Library Directories – added path to ..\ gtest-1.6.0\msvc\gtest\Debug
Linker > Input > Additional Dependencies - added gtestd.lib
All swell ?
I thought so, until I compiled & got the dreaded LNK2005: http://msdn.microsoft.com/en-us/library/72zdcz6f(v=vs.80).aspx
WTF – linker errors ?
To quote Cousin Eddie from National Lampoon’s Vacation:
'Everytime Catherine would turn in the microwave I'd piss my pants and forget my name' - C++ often makes me feel like an idiot !
Thank God for Pavel http://blogs.microsoft.co.il/blogs/pavely/ , who pointed out to me that I need to change the C runtime library from DLL multithreaded debug to multithreaded debug:
![clip_image002[5] clip_image002[5]](http://gwb.blob.core.windows.net/joshreuben/Windows-Live-Writer/C-Unit-Testing-with-GoogleTest_A85A/clip_image0025_thumb.gif)
I could now build.
I added an include to access the Google Test macros:
#include "gtest/gtest.h"
I then created a canonical testable function:
int Factorial(int x, int result = 1) {
    // Treat 1, 0 and negative input as the base case so the recursion always terminates
    if (x <= 1) return result;
    else return Factorial(x - 1, x * result);
}
I used the TEST macro to construct a named test – it leveraged the EXPECT_EQ & EXPECT_GT macros to test for equality and greater than respectively.
TEST(FactorialTest, Negative) {
EXPECT_EQ(1, Factorial(-1));
EXPECT_GT(Factorial(-10), 0);
}
In my main function, I initialized Google Test & used the RUN_ALL_TESTS macro to … run all tests.
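int _tmain(int argc, _TCHAR* argv[])
{
    testing::InitGoogleTest(&argc, argv);
    int result = RUN_ALL_TESTS(); // non-zero if any test failed
    int x;
    std::cin >> x; // keeps the console window open (needs #include <iostream>)
    return result;
}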
I debugged my code - It breaks into the test:

And I get the following test results displayed:

So I am up & running with C++ Google Test !
Further things to try:
All fun & games !
Saturday, January 12, 2013
2 years ago I built & presented a course on XNA 4: http://www.e4d.co.il/Events/ExpertDays2011/Courses/Details/29
XNA may be dead & buried, but the 3D programming concepts are tech-stack-agnostic – They apply equally well to DirectX. I’ve extracted this subsection into a set of slides:
enjoy !
Thursday, January 10, 2013
I've been reading through the C++ Cookbook (an oldie but a goodie – I assume that 99% of C++ out there is not modern C++, and modern C++ does not mean you don’t need to be able to grok templates, pointers etc – you may need to port something, or use a 3rd party lib)
Anyway, reading through stream manipulators, from my understanding this is how you pass a generic 'delegate' into a constructor & invoke it:
// assumes #include <ostream> and using namespace std;
template<typename T, typename C>
class ManipInfra {
public:
    ManipInfra(basic_ostream<C>& (*pFun)(basic_ostream<C>&, T), T val) // pass in the function pointer & its argument
        : manipFun_(pFun) // init list – init func pointer var
        , val_(val) {}
    void operator()(basic_ostream<C>& os) const
    {
        manipFun_(os, val_); // Invoke the generic function pointer with the stream and value
    }
private:
    basic_ostream<C>& (*manipFun_)(basic_ostream<C>&, T); // a function pointer – a delegate !
    T val_; // declared after manipFun_ so members match the init list order
};
Note: C++ 11 has std::function<T> which provides the same functionality – courtesy of my esteemed colleague Tomer Shamam http://blogs.microsoft.co.il/blogs/tomershamam/ :
class YourClass
{
public:
    YourClass(std::function<return_type (param_type param)> func)
    {
        ...
        func(param);
    }
};
YourClass y([](param_type param){ ...});
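To make the std::function idea concrete, here is a minimal runnable sketch. The names Notifier, Notify and DescribeValue are hypothetical, made up for this example, and the std::string(int) signature is just an assumption standing in for return_type and param_type.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical class: stores a callable "delegate" and invokes it on demand.
class Notifier {
public:
    explicit Notifier(std::function<std::string(int)> func)
        : func_(std::move(func)) {
        assert(static_cast<bool>(func_));  // an empty std::function cannot be called
    }

    // Forward the argument to the stored callable and return its result.
    std::string Notify(int value) const { return func_(value); }

private:
    std::function<std::string(int)> func_;
};

// Hypothetical helper: passes a lambda into the constructor, then invokes it.
inline std::string DescribeValue(int value) {
    Notifier n([](int v) { return "value=" + std::to_string(v); });
    return n.Notify(value);
}
```

Unlike the raw function pointer in ManipInfra, std::function also accepts lambdas with captures and functor objects.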
Sunday, December 30, 2012
Over the past year I went into study overdrive: I learned A LOT about C++, Maths, Algorithms, Finance & JavaScript. It was a great year in terms of knowledge acquisition.
pro amore scientiam !
C++
C / C++ is the language for performant algo dev. It also has a higher barrier to entry than C#. While C# is elegant, it is targeted towards LOB apps and it is very easy to pick up -> a proliferation of decent .NET code monkeys is a career race to the bottom. It is a wise time to move back to C++. Note: I am about to revise this all again !
· C# to C++ guide – a good refresher
· Revise Kinect API
· WMF from Pavel's course - I worked with this API for 2 months this year – non-trivial !
· ConcRT & PPL – I went through everything on msdn + Alon's course materials. Unlike TPL you can control priority. The agent paradigm is quite similar to Axum, TPL DataFlow.
· AMP – see my post here: http://geekswithblogs.net/JoshReuben/archive/2011/12/04/c-amp.aspx
· Book: C++ AMP http://amzn.to/WWxC2J – a great book with much practical advice.

· Win32, strings, pragmas, intrinsics, libs vs dlls, project settings, etc etc etc
· COM, ATL – be prepared for a lot of dead code out there !
· STL / TR1 – I wrote a set of exercises for a course on this
· Book: Windows via C/C++ http://amzn.to/WWxO1S - essentially thread management – I never realized how insecure Windows was until I read this !

· WRL & C++/CX - just in case pigs fly !
· C++ 11 – some amazing changes to the standard.
3D
3D is an interesting & non-trivial problem domain. I hope to do further work in this area.
· SIGGRAPH – http://www.siggraph.org/asia2011/hong-kong I attended this conference 1 year ago – an eye opening trip into the future! I participated in intense courses on OpenGL & OpenCL.
· Book: Mathematics for Computer Graphics http://amzn.to/WWxwZ2 - I used this to build an XNA course (before it died)

· Book: DirectX http://amzn.to/WWxj81 - see my summary here: http://geekswithblogs.net/JoshReuben/archive/2012/03/14/d3d-11-programming-in-a-nutshell.aspx

· Direct3D tooling – I'll blog on this sooner or later
· RasterTek DirectX tutorials – http://www.rastertek.com/tutindex.html - the only way to learn DX !
· Book: Unity3D http://amzn.to/WWxbph - I built the island, shot coconuts at the hut door, and started a particle fire !

Trading Systems Infrastructure
· I did a fair few consults for trading firms – they have unique high performant scalable infrastructure requirements – leave your .NET hat at the door ! see my blog post here: http://geekswithblogs.net/JoshReuben/archive/2012/03/26/low-latency-high-performant-financial-app-infrastructures.aspx
· OnixS FIX - http://www.onixs.biz/ – a low latency API for messaging with the FIX (Financial Information eXchange) protocol. Critical for trading systems.
· LMAX – I blogged about this trading API here: http://geekswithblogs.net/JoshReuben/archive/2011/07/08/lmax-.net-api-walkthrough.aspx
· Book: Big Data Glossary http://amzn.to/UpRaJs

· NOSQL – did an investigative comparison
· LMAX Disruptor – amazing ring buffer architectural paradigm: http://lmax-exchange.github.com/disruptor/
· Redis - http://redis.io/ - an incredibly fast distributed in memory cache
Maths, Computation & Cognition
Going through life, I feel it is a core personal goal to understand as much as possible about the nature of reality.
· Book: Physics http://amzn.to/WWA610 - a great coffee table book detailing the greatest 500 discoveries in physics

· Book: Schaums Computer Architecture http://amzn.to/UpNXte - an in depth drill down into registers, microcode & memory hierarchies. Very worthwhile read. Hadn't looked at this since my first degree 20 years ago !

· Book: Schaums Vector Analysis http://amzn.to/UpO9IV - nice refresher on linear algebra + some interesting stuff on Stokes Theorem & intro to Tensors

· Book: Schaums Mathematica http://amzn.to/UpPAHm - while this has very little penetration in the Israeli market, it is an amazingly powerful symbolic mathematical programming language environment. I will delve deeper. See my blogpost here: http://geekswithblogs.net/JoshReuben/archive/2012/07/04/mathematica-programming-languagendashan-introduction.aspx

· Book: Excel Scientific Computing http://amzn.to/UpPGyG - everything you need to know about programming excel

· Book: Language instinct - http://amzn.to/UpPPCm - amazing coverage on how language works – will serve me well if I ever head into NLP

· Numerical analysis videos - http://numericalmethods.eng.usf.edu/videos/ - an awesome resource.
· Book: Short Intro to Reality - http://amzn.to/UpQ5Bf - philosophical questions regarding the simulated universe hypothesis – stop thinking like an ant, & read this.

· Book: Short Intro to Sociology - http://amzn.to/UpQbsO - a bit wishy washy, but a good analysis of how & why aspects of cultures are socially constructed

· Book: short intro to Game Theory http://amzn.to/UpQQub - will serve me well if I ever head into constructing a probabilistic reasoning system for decision support.

· Book: Short intro to Nothing http://amzn.to/UpQP9s - Is the Higgs Field the new ether ?

· Book: History of Modern Computing http://amzn.to/UpQV0V - gives you perspective into how fast paradigms can shift

· Book: The Quest for AI http://amzn.to/UpRIPi - a good refresher after reading Norvig & Russell

· Book: C# numerical computing http://amzn.to/UpT9xm - a nice overview via a toy language

· Book: millennium problems http://amzn.to/UpTccv - it is quite amazing to grasp the current limits of our understanding

· Extreme Optimization Library – the strongest C# maths library - http://www.extremeoptimization.com/
· Book: C# ANNs + Encog http://amzn.to/UpTikt - see my blogpost here: http://geekswithblogs.net/JoshReuben/archive/2011/02/04/c-neural-networks-with-encog.aspx

Finance & Economics
The world is rapidly changing – we don’t live in a bubble – it's worthwhile to understand the patterns in the undercurrents.
· Book: How the west was lost http://amzn.to/UpTro1

· Book: Cartoon Guide to Macroeconomics http://amzn.to/UpTDUj

· Book: After America http://amzn.to/UpTxMd

· Book: Crash Course http://amzn.to/UpTQql

· Peak Oil http://amzn.to/UpU3K6

JavaScript
I've got a fair few years experience as a web dev – it's important to keep abreast of the current Cambrian explosion of JavaScript platforms. JavaScript is pervasive.
· WebGL- a GPU accelerated javascript library for 3D that loads shaders via the canvas 3D context – see my blogpost here: http://geekswithblogs.net/JoshReuben/archive/2012/07/26/an-introduction-to-webgl.aspx
· Book: Javascript pocket guide http://amzn.to/UpUp3v - a good refresher

· JS WebWorkers – true parallelization on the client
· Book: HTML 5 & CSS 3 http://amzn.to/UpUO67 - good overview of what's new

· KnockoutJS – a decent MVVM framework with binding support
· EmberJS – the strongest client side platform – feature rich
· Book: Javascript Design Patterns – good to revise 2 things in one ! http://addyosmani.com/resources/essentialjsdesignpatterns/book/
· Javascript for C# devs http://blog.boyet.com/blog/javascriptlessons/ - a great blog series
· Book: JQuery Pocket Guide http://amzn.to/UpUVP8 - another good refresher

· QUnit – client side unit testing
· JQueryUI
· HTML5 Boilerplate
· Book: ImpactJS HTML5 game programming http://amzn.to/UpV9Wc

· SocketIO , Pusher – WebSockets APIs
· REST API design – consulted on this for a trading client.
· FireBug – amazing debugging capabilities – a must know.
· WebStorm – a great IDE for webdev without the VS bloat
· JSLint / JSHint - code quality tools
· NodeJS – beginners guide http://amzn.to/UpViZR

· NodeJS baby steps & toddler steps - http://elegantcode.com/category/node-js/ - a great blog series
· JS RiverTrail – JavaScript GPGPU – see my blogpost here: http://geekswithblogs.net/JoshReuben/archive/2012/11/29/rivertrail---javascript-gppgu-data-parallelism.aspx
Conclusion: Groundwork laid for 2013
It was definitely a great year in terms of knowledge acquisition.
It is true that I need to consolidate – that’s exactly what I am doing now !
Now, this may all seem freaky to your standard LOB dev tradesman drone, who micro-specializes in XAML & WCF.
I have indeed absorbed a lot of disparate information, and I definitely need to revise & get practical hands-on; however once something is conceptualized, it is yours to rapidly refresh on demand forever. Besides, I have made detailed summaries on various topics.
The whole is greater than the sum of the parts – a worthwhile goal in life is to be a renaissance man: http://en.wikipedia.org/wiki/Polymath
I am not just a jack-of-all-trades – I can quickly drill down & become an expert in many topics.
Maths & C++ have opened up many new worlds for me – which I can now dive into in 2013.
Anyone who thinks I have spread myself too thin is just under-estimating me – don’t tell me what I cannot do !
Tuesday, December 25, 2012
I recently read the Big Data Glossary - http://www.amazon.com/Big-Data-Glossary-Pete-Warden/dp/1449314597

Big Data is essentially a MapReduce stack for scatter-gather-aggregate scaleout of compute jobs.
The core tools are:
- Apache Hadoop – a MapReduce scale-out infrastructure
- Hive – SQL language for Hadoop
- Pig – procedural language for Hadoop
- Cascading – orchestration of jobs on Hadoop
- Datameer – BI on Hadoop
- Mahout – distributed machine learning library on Hadoop
- ZooKeeper – work coordinator / monitor
On top of these are various tools & extensions, as well as ports (e.g. HDInsight )
You also need to be aware of elastic cloud platforms to run on, and the various NoSQL DBs tend to be leveraged in this space as well.
Additionally, MapReduce is just an infrastructure pattern for distributed processing of algorithms – you will not get much usage out of it without knowledge of the appropriate algorithms to leverage on the nodes in your compute grid – the whole point of Big Data.
Sunday, December 9, 2012
Numerical Analysis – When, What, (but not how)
Once you understand the Math & know C++, Numerical Methods are basically blocks of iterative & conditional math code. I found the real trick was seeing the forest for the trees – knowing which method to use for which situation. It's pretty easy to get lost in the details – so I’ve tried to organize these methods in a way that lets me quickly look them up.
I’ve included links to detailed explanations and to C++ code examples.
I’ve tried to classify Numerical methods in the following broad categories:
- Solving Systems of Linear Equations
- Solving Non-Linear Equations Iteratively
- Interpolation
- Curve Fitting
- Optimization
- Numerical Differentiation & Integration
- Solving ODEs
- Boundary Problems
- Solving EigenValue problems
Enjoy – I did !
Solving Systems of Linear Equations
Overview
Solve sets of linear algebraic equations in n unknowns
The set is commonly in matrix form
Gauss-Jordan Elimination
http://en.wikipedia.org/wiki/Gauss%E2%80%93Jordan_elimination
C++: http://www.codekeep.net/snippets/623f1923-e03c-4636-8c92-c9dc7aa0d3c0.aspx
Produces the solution of the equations & the inverse of the coefficient matrix
Efficient, stable
2 steps:
· Forward Elimination – matrix decomposition: reduce set to triangular form (0s below the diagonal) or row echelon form. If degenerate, then there is no unique solution
· Backward Elimination – write the original matrix as the product of an invertible matrix & its reduced row-echelon matrix -> reduce set to row canonical form & use back-substitution to find the solution to the set
Elementary ops for matrix decomposition:
· Row multiplication
· Row switching
· Add multiples of rows to other rows
Use pivoting to ensure rows are ordered for achieving triangular form
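A minimal sketch of the two steps, under some assumptions: this is plain Gaussian elimination with partial pivoting followed by back-substitution (not the full Gauss-Jordan reduction to row canonical form), the matrix is dense and square, and the name SolveGauss and the 1e-12 singularity threshold are my own choices.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Solve A x = b. A and b are taken by value because elimination overwrites them.
std::vector<double> SolveGauss(std::vector<std::vector<double>> A, std::vector<double> b) {
    const std::size_t n = b.size();
    assert(A.size() == n);
    for (std::size_t k = 0; k < n; ++k) {
        // Pivoting (row switching): bring the largest |entry| of column k up to row k.
        std::size_t pivot = k;
        for (std::size_t i = k + 1; i < n; ++i)
            if (std::fabs(A[i][k]) > std::fabs(A[pivot][k])) pivot = i;
        if (std::fabs(A[pivot][k]) < 1e-12)
            throw std::runtime_error("matrix is singular (degenerate)");
        std::swap(A[k], A[pivot]);
        std::swap(b[k], b[pivot]);

        // Forward elimination: subtract multiples of row k to zero column k below the diagonal.
        for (std::size_t i = k + 1; i < n; ++i) {
            const double factor = A[i][k] / A[k][k];
            for (std::size_t j = k; j < n; ++j) A[i][j] -= factor * A[k][j];
            b[i] -= factor * b[k];
        }
    }
    // Back-substitution on the resulting upper-triangular system.
    std::vector<double> x(n);
    for (std::size_t i = n; i-- > 0;) {
        double sum = b[i];
        for (std::size_t j = i + 1; j < n; ++j) sum -= A[i][j] * x[j];
        x[i] = sum / A[i][i];
    }
    return x;
}
```

For example, the system 2x + y = 3, x + 3y = 5 gives x = 0.8, y = 1.4.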
LU Decomposition
http://en.wikipedia.org/wiki/LU_decomposition
C++: http://ganeshtiwaridotcomdotnp.blogspot.co.il/2009/12/c-c-code-lu-decomposition-for-solving.html

Represent the matrix as a product of lower & upper triangular matrices
A modified version of GJ Elimination
Advantage – can easily apply forward & backward substitution to solve triangular matrices
Techniques:
· Doolittle Method – sets the L matrix diagonal to unity
· Crout Method - sets the U matrix diagonal to unity
Note: only one of the L & U matrices carries the unity diagonal, which need not be stored, so both can be stored compactly in the same matrix
Gauss-Seidel Iteration
http://en.wikipedia.org/wiki/Gauss%E2%80%93Seidel_method
C++: http://www.nr.com/forum/showthread.php?t=722
Solve each equation for its diagonal unknown & sweep through the unknowns repeatedly, always using the most recently updated values, until successive iterates agree to within a tolerance.
A refinement of the Gauss-Jacobi method (which only uses values from the previous sweep): for many well-behaved systems it needs roughly half as many iterations to achieve the same tolerance
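A minimal sketch of a Gauss-Seidel solver: each equation is solved for its diagonal unknown, and every updated component is used immediately within the same sweep. The name GaussSeidel, the zero initial guess and the default tolerance are assumptions for illustration. Convergence is only guaranteed for suitable matrices, e.g. strictly diagonally dominant ones.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Iterate on A x = b until the largest change in any component drops below tol.
std::vector<double> GaussSeidel(const std::vector<std::vector<double>>& A,
                                const std::vector<double>& b,
                                double tol = 1e-10, int maxIter = 1000) {
    const std::size_t n = b.size();
    assert(A.size() == n);
    std::vector<double> x(n, 0.0);  // initial guess (assumption: all zeros)
    for (int iter = 0; iter < maxIter; ++iter) {
        double maxChange = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            double sum = b[i];
            for (std::size_t j = 0; j < n; ++j)
                if (j != i) sum -= A[i][j] * x[j];  // x[j] is already updated for j < i
            const double updated = sum / A[i][i];   // requires a non-zero diagonal
            maxChange = std::fmax(maxChange, std::fabs(updated - x[i]));
            x[i] = updated;
        }
        if (maxChange < tol) break;  // tolerance condition met
    }
    return x;
}
```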
Solving Non-Linear Equations Iteratively
find roots of polynomials – there may be 0, 1 or n solutions for an n order polynomial
use iterative techniques
Iterative methods
· used when there are no known analytical techniques
· Requires set functions to be continuous & differentiable
· Requires an initial seed value – choice is critical to convergence -> conduct multiple runs with different starting points & then select best result
· Systematic - iterate until diminishing returns, tolerance or max iteration conditions are met
· bracketing techniques will always yield convergent solutions, non-bracketing methods may fail to converge
Incremental method
if a nonlinear function has opposite signs at 2 ends of a small interval x1 & x2, then there is likely to be a solution in their interval – solutions are detected by evaluating a function over interval steps, for a change in sign, adjusting the step size dynamically.
Limitations – can miss closely spaced solutions in large intervals, cannot detect degenerate (coinciding) solutions, limited to functions that cross the x-axis, gives false positives for singularities
Fixed point method
http://en.wikipedia.org/wiki/Fixed-point_iteration
C++: http://books.google.co.il/books?id=weYj75E_t6MC&pg=PA79&lpg=PA79&dq=fixed+point+method++c%2B%2B&source=bl&ots=LQ-5P_taoC&sig=lENUUIYBK53tZtTwNfHLy5PEWDk&hl=en&sa=X&ei=wezDUPW1J5DptQaMsIHQCw&redir_esc=y#v=onepage&q=fixed%20point%20method%20%20c%2B%2B&f=false

Algebraically rearrange the equation f(x) = 0 to isolate a variable in the form x = g(x), then iterate x(k+1) = g(x(k)) from the seed value until successive values stop changing
Bisection method
http://en.wikipedia.org/wiki/Bisection_method
C++: http://numericalcomputing.wordpress.com/category/algorithms/

Bracketed - Select an initial interval, keep bisecting it at midpoint into sub-intervals and then apply incremental method on smaller & smaller intervals – zoom in
Adv: unaffected by function gradient -> reliable
Disadv: slow convergence
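A minimal bisection sketch, assuming f is continuous on [lo, hi] and changes sign across it. The name Bisect and the default tolerance and iteration cap are my own choices.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <stdexcept>

// Halve the bracketing interval until it is narrower than tol.
double Bisect(const std::function<double(double)>& f, double lo, double hi,
              double tol = 1e-10, int maxIter = 200) {
    assert(lo < hi);
    double fLo = f(lo);
    const double fHi = f(hi);
    if (fLo == 0.0) return lo;
    if (fHi == 0.0) return hi;
    if (fLo * fHi > 0.0) throw std::invalid_argument("root is not bracketed");
    for (int i = 0; i < maxIter && (hi - lo) > tol; ++i) {
        const double mid = 0.5 * (lo + hi);
        const double fMid = f(mid);
        if (fMid == 0.0) return mid;
        if (fLo * fMid < 0.0) {
            hi = mid;    // sign change is in the left half
        } else {
            lo = mid;    // sign change is in the right half
            fLo = fMid;
        }
    }
    return 0.5 * (lo + hi);
}
```

Each iteration halves the interval, so about 34 iterations take an interval of width 2 down to 1e-10, regardless of the function's gradient.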
False Position Method
http://en.wikipedia.org/wiki/False_position_method
C++: http://www.dreamincode.net/forums/topic/126100-bisection-and-false-position-methods/
Bracketed - Select an initial interval , & use the relative value of function at interval end points to select next sub-intervals (estimate how far between the end points the solution might be & subdivide based on this)
Newton-Raphson method
http://en.wikipedia.org/wiki/Newton's_method
C++: http://www-users.cselabs.umn.edu/classes/Summer-2012/csci1113/index.php?page=./newt3

Also known as Newton's method
Convenient, efficient
Not bracketed – only a single initial guess is required to start iteration – requires an analytical expression for the first derivative of the function as input.
Evaluates the function & its derivative at each step.
Can be extended to the Newton MultiRoot method for solving multiple roots
Can be easily applied to a set of n coupled non-linear equations – conduct a Taylor Series expansion of each function, dropping the second-order and higher terms, rewrite as a Jacobian matrix of PDs & convert to simultaneous linear equations !!!
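A minimal single-variable Newton-Raphson sketch. The caller supplies both the function and an analytical first derivative; the name NewtonRaphson and the default tolerance are assumptions for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <stdexcept>

// Iterate x <- x - f(x) / f'(x) from a single initial guess x0.
double NewtonRaphson(const std::function<double(double)>& f,
                     const std::function<double(double)>& df,
                     double x0, double tol = 1e-12, int maxIter = 50) {
    assert(tol > 0.0 && maxIter > 0);
    double x = x0;
    for (int i = 0; i < maxIter; ++i) {
        const double slope = df(x);  // derivative is evaluated at every step
        if (slope == 0.0) throw std::runtime_error("zero derivative: cannot take a step");
        const double step = f(x) / slope;
        x -= step;
        if (std::fabs(step) < tol) return x;
    }
    throw std::runtime_error("Newton-Raphson did not converge");
}
```

Because the method is not bracketed, a poor x0 can diverge, which is why the sketch throws instead of returning the last iterate.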
Secant Method
http://en.wikipedia.org/wiki/Secant_method
C++: http://forum.vcoderz.com/showthread.php?p=205230

Unlike N-R, can estimate first derivative from an initial interval (does not require root to be bracketed) instead of inputting it
Since derivative is approximated, may converge slower. Is fast in practice as it does not have to evaluate the derivative at each step.
Similar implementation to False Position method
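A minimal secant sketch: the derivative is replaced by the slope of the line through the last two iterates, so only one new function evaluation is needed per step. The name Secant and the defaults are my own choices.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <stdexcept>

// x0 and x1 are two starting guesses; they do not need to bracket the root.
double Secant(const std::function<double(double)>& f, double x0, double x1,
              double tol = 1e-12, int maxIter = 100) {
    assert(x0 != x1);
    double f0 = f(x0);
    double f1 = f(x1);
    for (int i = 0; i < maxIter; ++i) {
        if (f1 == f0) throw std::runtime_error("flat secant line: cannot take a step");
        // Root of the straight line through (x0, f0) and (x1, f1).
        const double x2 = x1 - f1 * (x1 - x0) / (f1 - f0);
        if (std::fabs(x2 - x1) < tol) return x2;
        x0 = x1; f0 = f1;
        x1 = x2; f1 = f(x2);  // the only new function evaluation in this step
    }
    throw std::runtime_error("secant method did not converge");
}
```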
Birge-Vieta Method
http://mat.iitm.ac.in/home/sryedida/public_html/caimna/transcendental/polynomial%20methods/bv%20method.html
C++: http://books.google.co.il/books?id=cL1boM2uyQwC&pg=SA3-PA51&lpg=SA3-PA51&dq=Birge-Vieta+Method+c%2B%2B&source=bl&ots=QZmnDTK3rC&sig=BPNcHHbpR_DKVoZXrLi4nVXD-gg&hl=en&sa=X&ei=R-_DUK2iNIjzsgbE5ID4Dg&redir_esc=y#v=onepage&q=Birge-Vieta%20Method%20c%2B%2B&f=false
combines Horner's method of polynomial evaluation (transforming into lesser degree polynomials that are more computationally efficient to process) with Newton-Raphson to provide a computational speed-up
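A minimal sketch of the idea: one Horner pass produces both p(x) and p'(x), and those feed a Newton-Raphson step. The name BirgeVieta and the coefficient ordering (highest power first) are assumptions for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// coeffs = {a_n, ..., a_1, a_0} for p(x) = a_n x^n + ... + a_1 x + a_0.
double BirgeVieta(const std::vector<double>& coeffs, double x0,
                  double tol = 1e-12, int maxIter = 100) {
    assert(coeffs.size() >= 2);
    double x = x0;
    for (int iter = 0; iter < maxIter; ++iter) {
        double p = coeffs[0];  // accumulates p(x)
        double dp = 0.0;       // accumulates p'(x)
        for (std::size_t i = 1; i < coeffs.size(); ++i) {
            dp = dp * x + p;         // synthetic division of the quotient polynomial
            p = p * x + coeffs[i];   // Horner step for the polynomial itself
        }
        if (dp == 0.0) throw std::runtime_error("zero derivative: cannot take a step");
        const double step = p / dp;  // Newton-Raphson update
        x -= step;
        if (std::fabs(step) < tol) return x;
    }
    throw std::runtime_error("Birge-Vieta did not converge");
}
```

For p(x) = x^3 - 6x^2 + 11x - 6 (roots 1, 2 and 3), starting from x0 = 0.5 converges to the root at 1.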
Interpolation
Overview
Construct new data points for as close as possible fit within range of a discrete set of known points (that were obtained via sampling, experimentation)
Use Taylor Series Expansion of a function f(x) around a specific value for x
Linear Interpolation
http://en.wikipedia.org/wiki/Linear_interpolation
C++: http://www.hamaluik.com/?p=289

Straight line between 2 points -> concatenate interpolants between each pair of data points
Bilinear Interpolation
http://en.wikipedia.org/wiki/Bilinear_interpolation
C++: http://supercomputingblog.com/graphics/coding-bilinear-interpolation/2/

Extension of the linear function for interpolating functions of 2 variables – perform linear interpolation first in 1 direction, then in another.
Used in image processing – e.g. texture mapping filter. Uses 4 vertices to interpolate a value within a unit cell.
Lagrange Interpolation
http://en.wikipedia.org/wiki/Lagrange_polynomial
C++: http://www.codecogs.com/code/maths/approximation/interpolation/lagrange.php

For polynomials
Requires recomputation for all terms for each distinct x value – can only be applied for small number of nodes
Numerically unstable
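A minimal sketch of direct Lagrange interpolation, assuming the x values are distinct. The name LagrangeInterpolate is my own. Note how every basis term is recomputed for each new x, which is the cost mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Evaluate the polynomial through the points (xs[i], ys[i]) at x.
double LagrangeInterpolate(const std::vector<double>& xs,
                           const std::vector<double>& ys, double x) {
    assert(xs.size() == ys.size() && !xs.empty());
    double result = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        // Basis polynomial L_i(x): equals 1 at xs[i] and 0 at every other node.
        double basis = 1.0;
        for (std::size_t j = 0; j < xs.size(); ++j)
            if (j != i) basis *= (x - xs[j]) / (xs[i] - xs[j]);
        result += ys[i] * basis;
    }
    return result;
}
```

With the nodes (0, 0), (1, 1), (2, 4) the interpolant is x^2, so evaluating at 1.5 gives 2.25.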
Barycentric Interpolation
http://epubs.siam.org/doi/pdf/10.1137/S0036144502417715
C++: http://www.gamedev.net/topic/621445-barycentric-coordinates-c-code-check/

Rearrange the terms in the equation of the Lagrange interpolation by defining weight functions that are independent of the interpolated value of x
Newton Divided Difference Interpolation
http://en.wikipedia.org/wiki/Newton_polynomial
C++: http://jee-appy.blogspot.co.il/2011/12/newton-divided-difference-interpolation.html
Hermite Divided Differences:

Interpolation polynomial approximation for a given set of data points in the Newton form - divided differences are used to approximately calculate the various differences.
For a given set of 3 data points , fit a quadratic interpolant through the data
The bracket notation f[x0, ..., xk] allows Newton divided differences to be calculated recursively
Difference table
Cubic Spline Interpolation
http://en.wikipedia.org/wiki/Spline_interpolation
C++: https://www.marcusbannerman.co.uk/index.php/home/latestarticles/42-articles/96-cubic-spline-class.html

Spline is a piecewise polynomial
Provides smoothness – for interpolations with significantly varying data
Use weighted coefficients to bend the function to be smooth & its 1st & 2nd derivatives are continuous through the edge points in the interval
Curve Fitting
A generalization of interpolating whereby given data points may contain noise -> the curve does not necessarily pass through all the points
Least Squares Fit
http://en.wikipedia.org/wiki/Least_squares
C++: http://www.ccas.ru/mmes/educat/lab04k/02/least-squares.c

Residual – difference between observed value & expected value
Model function is often chosen as a linear combination of the specified functions
Determines:
A) The model instance in which the sum of squared residuals has the least value
B) param values for which model best fits data
Straight Line Fit
Linear correlation between independent variable and dependent variable
Linear Regression
http://en.wikipedia.org/wiki/Linear_regression
C++: http://www.oocities.org/david_swaim/cpp/linregc.htm

Special case of statistically exact extrapolation
Leverage least squares
Given a basis function, the sum of the squared residuals is determined and the corresponding gradient equation is expressed as a set of normal linear equations in matrix form that can be solved (e.g. using LU Decomposition)
Can be weighted - Drop the assumption that all errors have the same significance -> confidence of accuracy is different for each data point. Fit the function closer to points with higher weights
Polynomial Fit - use a polynomial basis function
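For the straight-line case the normal equations have a closed-form solution. A minimal unweighted sketch, where the names Line and FitLine are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

struct Line {
    double slope;
    double intercept;
};

// Least-squares fit of y = slope * x + intercept, minimising the sum of squared residuals.
Line FitLine(const std::vector<double>& xs, const std::vector<double>& ys) {
    assert(xs.size() == ys.size() && xs.size() >= 2);
    const double n = static_cast<double>(xs.size());
    double sumX = 0.0, sumY = 0.0, sumXX = 0.0, sumXY = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        sumX += xs[i];
        sumY += ys[i];
        sumXX += xs[i] * xs[i];
        sumXY += xs[i] * ys[i];
    }
    // Determinant of the 2x2 normal equations; zero when all x values coincide.
    const double denom = n * sumXX - sumX * sumX;
    if (denom == 0.0) throw std::invalid_argument("all x values are identical");
    const double slope = (n * sumXY - sumX * sumY) / denom;
    const double intercept = (sumY - slope * sumX) / n;
    return Line{slope, intercept};
}
```

Fitting the points (0, 1), (1, 3), (2, 5), (3, 7) recovers slope 2 and intercept 1.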
Moving Average
http://en.wikipedia.org/wiki/Moving_average
C++: http://www.codeproject.com/Articles/17860/A-Simple-Moving-Average-Algorithm

Used for smoothing (cancel fluctuations to highlight longer-term trends & cycles), time series data analysis, signal processing filters
Replace each data point with average of neighbors.
Can be simple (SMA), weighted (WMA), exponential (EMA). Lags behind latest data points – extra weight can be given to more recent data points. Weights can decrease arithmetically or exponentially according to distance from point.
Parameters: smoothing factor, period, weight basis
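A minimal sketch of the simple (SMA) variant using a running window sum, so each output costs O(1). The name SimpleMovingAverage is my own, and the choice to emit output only once a full window is available is an assumption; other conventions pad the start instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Trailing simple moving average over `period` points.
std::vector<double> SimpleMovingAverage(const std::vector<double>& data, std::size_t period) {
    assert(period > 0);
    std::vector<double> out;
    double windowSum = 0.0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        windowSum += data[i];
        if (i >= period) windowSum -= data[i - period];  // drop the point leaving the window
        if (i + 1 >= period) out.push_back(windowSum / static_cast<double>(period));
    }
    return out;
}
```

For the series 1, 2, 3, 4, 5 with a period of 3 this yields 2, 3, 4. The output lags the input, as noted above.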
Optimization
Overview
Given function with multiple variables, find Min (or max by minimizing –f(x))
Iterative approach
Efficient, but not necessarily reliable
Conditions: noisy data, constraints, non-linear models
Detection via sign of first derivative - Derivative of saddle points will be 0
Local minima
Bisection method
Similar method for finding a root for a non-linear equation
Start with an interval that contains a minimum
Golden Search method
http://en.wikipedia.org/wiki/Golden_section_search
C++: http://www.codecogs.com/code/maths/optimization/golden.php

Bisect intervals according to golden ratio 0.618..
Achieves reduction by evaluating the function once per iteration instead of twice
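A minimal golden section search sketch, assuming f has a single minimum inside [a, b]. The name GoldenSectionMin and the default tolerance are my own choices. One interior point is reused on every iteration, so only one new function evaluation is needed.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Shrink [a, b] around the minimum of a unimodal function.
double GoldenSectionMin(const std::function<double(double)>& f, double a, double b,
                        double tol = 1e-8) {
    assert(a < b && tol > 0.0);
    const double invPhi = (std::sqrt(5.0) - 1.0) / 2.0;  // 0.618...
    double c = b - invPhi * (b - a);
    double d = a + invPhi * (b - a);
    double fc = f(c);
    double fd = f(d);
    while ((b - a) > tol) {
        if (fc < fd) {       // minimum lies in [a, d]
            b = d;
            d = c; fd = fc;  // old c becomes the new d: no re-evaluation needed
            c = b - invPhi * (b - a);
            fc = f(c);
        } else {             // minimum lies in [c, b]
            a = c;
            c = d; fc = fd;  // old d becomes the new c
            d = a + invPhi * (b - a);
            fd = f(d);
        }
    }
    return 0.5 * (a + b);
}
```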
Newton-Raphson Method
Brent method
http://en.wikipedia.org/wiki/Brent's_method
C++: http://people.sc.fsu.edu/~jburkardt/cpp_src/brent/brent.cpp
Based on quadratic or parabolic interpolation – if the function is smooth & parabolic near to the minimum, then a parabola fitted through any 3 points should approximate the minimum – fails when the 3 points are collinear, in which case the denominator is 0
Simplex Method
http://en.wikipedia.org/wiki/Simplex_algorithm
C++: http://www.codeguru.com/cpp/article.php/c17505/Simplex-Optimization-Algorithm-and-Implemetation-in-C-Programming.htm

Find a minimum of a multi-variable function (the description here is the Nelder-Mead downhill simplex method; it finds a local minimum, a global one is not guaranteed)
Direct search – no derivatives required
At each step it maintains a non-degenerate simplex – a convex hull of n+1 vertices.
Obtains the minimum for a function with n variables by evaluating the function at n+1 points, iteratively replacing the point of worst result with a better point, shrinking the multidimensional simplex around the best point.
Point replacement involves expanding & contracting the simplex near the worst value point to determine a better replacement point
Oscillation can be avoided by choosing the 2nd worst result
Restart if it gets stuck
Parameters: contraction & expansion factors
Simulated Annealing
http://en.wikipedia.org/wiki/Simulated_annealing
C++: http://code.google.com/p/cppsimulatedannealing/

Analogy to heating & cooling metal to strengthen its structure
Stochastic method – apply random perturbation search for global minima - Avoid entrapment in local minima via hill climbing (occasionally accepting a worse point)
Heating schedule - Annealing schedule params: temperature, iterations at each temp, temperature delta
Cooling schedule – can be linear, step-wise or exponential
Differential Evolution
http://en.wikipedia.org/wiki/Differential_evolution
C++: http://www.amichel.com/de/doc/html/
More advanced stochastic methods analogous to biological processes: Genetic algorithms, evolution strategies
Parallel direct search method against multiple discrete or continuous variables
Initial population of variable vectors chosen randomly – if weighted difference vector of 2 vectors yields a lower objective function value then it replaces the comparison vector
Many params: #parents, #variables, step size, crossover constant etc
Convergence is slow – many more function evaluations than simulated annealing
Numerical Differentiation
Overview
2 approaches to finite difference methods:
· A) approximate function via polynomial interpolation then differentiate
· B) Taylor series approximation – additionally provides error estimate
Finite Difference methods
http://en.wikipedia.org/wiki/Finite_difference_method
C++: http://www.wpi.edu/Pubs/ETD/Available/etd-051807-164436/unrestricted/EAMPADU.pdf

Find differences between high order derivative values - Approximate differential equations by finite differences at evenly spaced data points
Based on forward & backward Taylor series expansion of f(x) about x plus or minus multiples of delta h.
Forward / backward expansions - the sum of the two series contains only the even derivatives and the difference contains only the odd derivatives – coupled equations that can be solved.
Combining them this way (the central difference) provides an approximation of the derivative within O(h^2) accuracy; a one-sided forward or backward difference on its own is only O(h)
There is also the extended central difference, which has O(h^4) accuracy
Richardson Extrapolation
http://en.wikipedia.org/wiki/Richardson_extrapolation
C++: http://mathscoding.blogspot.co.il/2012/02/introduction-richardson-extrapolation.html
A sequence acceleration method applied to finite differences
Fast convergence, high accuracy O(h^4)
Derivatives via Interpolation
Cannot apply Finite Difference method to discrete data points at uneven intervals – so need to approximate the derivative of f(x) using the derivative of the interpolant via 3 point Lagrange Interpolation
Note: the higher the order of the derivative, the lower the approximation precision
Numerical Integration
Estimate finite & infinite integrals of functions
More accurate procedure than numerical differentiation
Use when it is not possible to obtain an integral of a function analytically or when the function is not given, only the data points are
Newton Cotes Methods
http://en.wikipedia.org/wiki/Newton%E2%80%93Cotes_formulas
C++: http://www.siafoo.net/snippet/324

For equally spaced data points
Computationally easy – based on local interpolation of n rectangular strip areas that is piecewise fitted to a polynomial to get the sum total area
Evaluate the integrand at n+1 evenly spaced points – approximate definite integral by Sum
Weights are derived from Lagrange Basis polynomials
Leverage Trapezoidal Rule for 2 point formulas, Simpson 1/3 Rule for 3 point formulas, Simpson 3/8 Rule for 4 point formulas. For 5 point formulas use Boole's Rule (often misprinted as Bode's Rule). Higher orders obtain more accurate results
Trapezoidal Rule uses simple area, Simpsons Rule replaces the integrand f(x) with a quadratic polynomial p(x) that uses the same values as f(x) for its end points, but adds a midpoint
Romberg Integration
http://en.wikipedia.org/wiki/Romberg's_method
C++: http://code.google.com/p/romberg-integration/downloads/detail?name=romberg.cpp&can=2&q=
Combines trapezoidal rule with Richardson Extrapolation
Evaluates the integrand at equally spaced points
The integrand must have continuous derivatives
Each R(n,m) extrapolation uses a higher order integrand polynomial replacement rule (zeroth starts with trapezoidal) -> a lower triangular matrix set of equation coefficients where the bottom right term has the most accurate approximation. The process continues until the difference between 2 successive diagonal terms becomes sufficiently small.
Gaussian Quadrature
http://en.wikipedia.org/wiki/Gaussian_quadrature
C++: http://www.alglib.net/integration/gaussianquadratures.php
Data points are chosen to yield best possible accuracy – requires fewer evaluations
Ability to handle singularities, functions that are difficult to evaluate
The integrand can include a weighting function determined by a set of orthogonal polynomials.
Points & weights are selected so that the integrand yields the exact integral if f(x) is a polynomial of degree <= 2n+1
Techniques (basically different weighting functions):
· Gauss-Legendre Integration w(x)=1
· Gauss-Laguerre Integration w(x)=e^-x
· Gauss-Hermite Integration w(x)=e^-x^2
· Gauss-Chebyshev Integration w(x)= 1 / Sqrt(1-x^2)
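As a small illustration of the Gauss-Legendre case (w(x)=1), here is a sketch of the 3 point rule, optionally applied over several panels. The name gauss_legendre3 and the panels parameter are assumptions of mine; the nodes 0 and ±sqrt(3/5) with weights 8/9 and 5/9 are the standard tabulated values for 3 points.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>

// 3 point Gauss-Legendre rule, mapped from [-1, 1] onto each panel of [a, b].
// A single panel is exact for polynomials up to degree 5.
double gauss_legendre3(const std::function<double(double)>& f, double a, double b,
                       std::size_t panels = 1) {
    assert(panels > 0);
    const double node = std::sqrt(3.0 / 5.0);  // outer nodes sit at +/- sqrt(3/5)
    const double width = (b - a) / static_cast<double>(panels);
    const double half = 0.5 * width;           // scale factor of the change of variable
    double total = 0.0;
    for (std::size_t i = 0; i < panels; ++i) {
        const double mid = a + width * (static_cast<double>(i) + 0.5);
        total += (5.0 / 9.0) * f(mid - half * node)
               + (8.0 / 9.0) * f(mid)
               + (5.0 / 9.0) * f(mid + half * node);
    }
    return half * total;
}
```

Three evaluations are enough to integrate a quintic exactly, which is what "requires fewer evaluations" means in practice.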
Solving ODEs
Use when high order differential equations cannot be solved analytically
Evaluated under boundary conditions
RK for systems – a high order differential equation can always be transformed into a coupled first order system of equations
Euler method
http://en.wikipedia.org/wiki/Euler_method
C++: http://rosettacode.org/wiki/Euler_method

First order Runge–Kutta method.
Simple recursive method – given an initial value, calculate derivative deltas.
Unstable & not very accurate (O(h) error) – not used in practice
A first-order method - the local error (truncation error per step) is proportional to the square of the step size, and the global error (error at a given time) is proportional to the step size
In evolving the solution between data points xn & xn+1, only evaluates derivatives at the beginning of the interval xn → asymmetric at boundaries
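A minimal sketch of the forward Euler update for y' = f(x, y) follows. The name euler and its parameter list are my own for illustration and are not taken from the linked Rosetta Code page.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Forward Euler for y' = f(x, y): advance from (x0, y0) to x_end in `steps` equal steps.
double euler(const std::function<double(double, double)>& f,
             double x0, double y0, double x_end, std::size_t steps) {
    assert(steps > 0);
    const double h = (x_end - x0) / static_cast<double>(steps);
    double x = x0, y = y0;
    for (std::size_t i = 0; i < steps; ++i) {
        y += h * f(x, y);  // the slope is sampled only at the start of the interval
        x += h;
    }
    return y;
}
```

Running it on y' = y with y(0) = 1 and doubling the step count should roughly halve the error at x = 1, which is the O(h) global error in action.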
Higher order Runge Kutta
http://en.wikipedia.org/wiki/Runge%E2%80%93Kutta_methods
C++: http://www.dreamincode.net/code/snippet1441.htm

2nd & 4th order RK - Introduces parameterized midpoints for more symmetric solutions → accuracy at higher computational cost
Adaptive RK – RK-Fehlberg – estimate the truncation error at each integration step & automatically adjust the step size to keep the error within prescribed limits. At each step 2 approximations are compared – if they disagree beyond a specified accuracy, the step size is reduced
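For comparison with Euler, here is a sketch of the classical fixed step 4th order RK update (not the adaptive Fehlberg variant). The name rk4 and its signature are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Classical 4th order Runge-Kutta for y' = f(x, y) with a fixed step size.
double rk4(const std::function<double(double, double)>& f,
           double x0, double y0, double x_end, std::size_t steps) {
    assert(steps > 0);
    const double h = (x_end - x0) / static_cast<double>(steps);
    double x = x0, y = y0;
    for (std::size_t i = 0; i < steps; ++i) {
        const double k1 = f(x, y);                           // slope at the start
        const double k2 = f(x + 0.5 * h, y + 0.5 * h * k1);  // first midpoint estimate
        const double k3 = f(x + 0.5 * h, y + 0.5 * h * k2);  // refined midpoint estimate
        const double k4 = f(x + h, y + h * k3);              // slope at the end
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0;      // Simpson-like weighted average
        x += h;
    }
    return y;
}
```

Each step costs 4 derivative evaluations instead of 1, but on y' = y just 10 steps already land within about 1e-5 of e.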
Boundary Value Problems
Where the conditions on the solution of a differential equation are specified at 2 different values of the independent variable x → more difficult, because you cannot just start at the point of the initial value – there may not be enough starting conditions available at either end point to produce a unique solution
An n-order equation will require n boundary conditions – need to determine the missing conditions (up to n-1 of them) at one boundary which cause the given conditions at the other boundary to be satisfied
Shooting Method
http://en.wikipedia.org/wiki/Shooting_method
C++: http://ganeshtiwaridotcomdotnp.blogspot.co.il/2009/12/c-c-code-shooting-method-for-solving.html

Iteratively guess the missing values for one end & integrate, then inspect the discrepancy with the boundary values of the other end to adjust the estimate
Given starting guesses u1 & u2 for the missing initial value which bracket the root u, solve for u using the false position method (solving the differential equation as an initial value problem via 4th order RK for each guess), then use u to solve the differential equations.
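Here is a hedged sketch of that loop for a second order equation y'' = g(x, y, y') with y given at both ends, where the missing value is the initial slope. The names integrate_ivp and shoot, the step count and the tolerance are all assumptions of mine, and the two slope guesses are assumed to bracket the root.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>

using Rhs = std::function<double(double, double, double)>;  // g(x, y, y')

// RK4 on the first order system (y, v) with v = y'; returns y at x1.
double integrate_ivp(const Rhs& g, double x0, double y0, double slope,
                     double x1, std::size_t steps) {
    assert(steps > 0);
    const double h = (x1 - x0) / static_cast<double>(steps);
    double x = x0, y = y0, v = slope;
    for (std::size_t i = 0; i < steps; ++i) {
        const double k1y = v;
        const double k1v = g(x, y, v);
        const double k2y = v + 0.5 * h * k1v;
        const double k2v = g(x + 0.5 * h, y + 0.5 * h * k1y, v + 0.5 * h * k1v);
        const double k3y = v + 0.5 * h * k2v;
        const double k3v = g(x + 0.5 * h, y + 0.5 * h * k2y, v + 0.5 * h * k2v);
        const double k4y = v + h * k3v;
        const double k4v = g(x + h, y + h * k3y, v + h * k3v);
        y += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0;
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0;
        x += h;
    }
    return y;
}

// Find the initial slope in [u1, u2] for which y(x1) hits y1,
// using false position on the miss distance at the far boundary.
double shoot(const Rhs& g, double x0, double y0, double x1, double y1,
             double u1, double u2, std::size_t steps = 200, double tol = 1e-10) {
    double f1 = integrate_ivp(g, x0, y0, u1, x1, steps) - y1;
    double f2 = integrate_ivp(g, x0, y0, u2, x1, steps) - y1;
    assert(f1 * f2 < 0.0);  // the guesses must bracket the root
    double u = u1;
    for (int iter = 0; iter < 100; ++iter) {
        u = u2 - f2 * (u2 - u1) / (f2 - f1);  // false position estimate
        const double fu = integrate_ivp(g, x0, y0, u, x1, steps) - y1;
        if (std::fabs(fu) < tol) break;
        if (f1 * fu < 0.0) { u2 = u; f2 = fu; }  // keep the root bracketed
        else               { u1 = u; f1 = fu; }
    }
    return u;
}
```

As a sanity check, y'' = -y with y(0) = 0 and y(pi/2) = 1 has the solution sin(x), so the recovered initial slope should come out close to 1.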
Finite Difference Method
For linear & non-linear systems
Higher order derivatives require more computational steps – some combinations for boundary conditions may not work though
Improve the accuracy by increasing the number of mesh points
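A minimal sketch for the simplest linear case, y'' = f(x) with fixed values at both ends, shows how the mesh turns the problem into a tridiagonal system. The name solve_bvp_fd is hypothetical and the Thomas algorithm is just one reasonable choice of solver here.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Finite difference solution of y'' = f(x) on [a, b] with y(a) = ya, y(b) = yb.
// Uses n interior mesh points and the central difference
// (y[i-1] - 2 y[i] + y[i+1]) / h^2 = f(x[i]), solved with the Thomas algorithm.
// Returns all n + 2 mesh values including the two boundary values.
std::vector<double> solve_bvp_fd(const std::function<double(double)>& f,
                                 double a, double b, double ya, double yb, std::size_t n) {
    assert(n >= 1);
    const double h = (b - a) / static_cast<double>(n + 1);
    std::vector<double> rhs(n), cprime(n), y(n + 2);
    for (std::size_t i = 0; i < n; ++i) {
        rhs[i] = h * h * f(a + h * static_cast<double>(i + 1));
    }
    rhs[0] -= ya;      // known boundary values move to the right hand side
    rhs[n - 1] -= yb;
    // Forward elimination: diagonal is -2, sub and super diagonals are 1.
    cprime[0] = 1.0 / -2.0;
    rhs[0] /= -2.0;
    for (std::size_t i = 1; i < n; ++i) {
        const double denom = -2.0 - cprime[i - 1];
        cprime[i] = 1.0 / denom;
        rhs[i] = (rhs[i] - rhs[i - 1]) / denom;
    }
    // Back substitution into the interior entries of y.
    y[0] = ya;
    y[n + 1] = yb;
    y[n] = rhs[n - 1];
    for (std::size_t i = n - 1; i-- > 0;) {
        y[i + 1] = rhs[i] - cprime[i] * y[i + 2];
    }
    return y;
}
```

For y'' = 6x with y(0) = 0 and y(1) = 1 the exact solution is x^3, and because the central difference is exact for cubics the mesh values should match it to rounding error.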
Solving EigenValue Problems
For an eigenvector v, an eigenvalue λ can substitute for the matrix in matrix multiplication (Av = λv) → finding the eigenvalues converts the matrix problem into finding the roots of the characteristic polynomial
For a given set of equations in matrix form, determine the eigenvalues & eigenvectors
Similar Matrices - have same eigenvalues. Use orthogonal similarity transforms to reduce a matrix to diagonal form from which eigenvalue(s) & eigenvectors can be computed iteratively

Jacobi method
http://en.wikipedia.org/wiki/Jacobi_method
C++: http://people.sc.fsu.edu/~jburkardt/classes/acs2_2008/openmp/jacobi/jacobi.html
Robust but Computationally intense – use for small matrices < 10x10
Power Iteration
http://en.wikipedia.org/wiki/Power_iteration
For a given real symmetric matrix, generate the single eigenvalue of largest magnitude & its eigenvector
Simplest method – does not compute a matrix decomposition → suitable for large, sparse matrices
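Here is a hedged sketch of the iteration on a small dense matrix. The helper names multiply, dot and power_iteration are my own, and a real sparse implementation would swap out only the matrix-vector product.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;

// Dense matrix-vector product; for a sparse matrix only this routine changes.
Vector multiply(const Matrix& a, const Vector& x) {
    Vector y(a.size(), 0.0);
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < x.size(); ++j) y[i] += a[i][j] * x[j];
    return y;
}

double dot(const Vector& x, const Vector& y) {
    double s = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) s += x[i] * y[i];
    return s;
}

// Repeatedly apply the matrix and renormalize. Returns the dominant eigenvalue
// estimate; v is overwritten with the matching unit eigenvector estimate.
double power_iteration(const Matrix& a, Vector& v,
                       std::size_t max_iter = 1000, double tol = 1e-12) {
    assert(!a.empty() && a.size() == v.size());
    const double norm0 = std::sqrt(dot(v, v));
    assert(norm0 > 0.0);  // the starting vector must be non-zero
    for (double& x : v) x /= norm0;
    double lambda = 0.0;
    for (std::size_t k = 0; k < max_iter; ++k) {
        const Vector w = multiply(a, v);
        const double next = dot(v, w);  // Rayleigh quotient, since |v| = 1
        const double norm = std::sqrt(dot(w, w));
        if (norm == 0.0) return 0.0;    // v lies in the null space of a
        for (std::size_t i = 0; i < v.size(); ++i) v[i] = w[i] / norm;
        const bool converged = std::fabs(next - lambda) < tol;
        lambda = next;
        if (converged) break;
    }
    return lambda;
}
```

On the symmetric matrix {{2, 1}, {1, 2}} with eigenvalues 3 and 1, starting from (1, 0), the estimate should converge to 3 with an eigenvector along (1, 1).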
Inverse Iteration
Variation of power iteration method – generates the eigenvalue of smallest magnitude by applying power iteration to the inverse matrix
Rayleigh Method
http://en.wikipedia.org/wiki/Rayleigh's_method_of_dimensional_analysis
Variation of power iteration method
Rayleigh Quotient Method
Variation of inverse iteration method
Matrix Tri-diagonalization Method
Use the Householder algorithm to reduce an NxN symmetric matrix to a tridiagonal real symmetric matrix via N-2 orthogonal transforms
What's Next
Outside of Numerical Methods there are lots of different types of algorithms that I’ve learned over the decades:
Sooner or later, I’ll cover the above topics as well.
Thursday, November 29, 2012
I am about to embark on a great journey – over the next 6 weeks I plan to read through C++ Numerical Recipes 3rd edition http://amzn.to/YtdpkS

I'll be reading this with an eye to C++ AMP, thinking about implementing the suitable subset (non-recursive, additive, commutative) to run on the GPU.
APIs supporting HPC, GPGPU or MapReduce are all useful – provided you have the ability to choose the correct algorithm to leverage them.
I really think this is the most fascinating area of programming – a lot more exciting than LOB CRUD !!!
When you think about it, everything is a function – we categorize & we extrapolate.
As abstractions get higher & less leaky, sooner or later information systems programming will become a non-programmer task – you will be using WYSIWYG designers to build:
- GUIs
- MVVM
- service mapping & virtualization
- workflows
- ORM
- Entity relations In the data source
SharePoint / LightSwitch are not there yet, but every iteration gets closer.
For information workers, managed code is a race to the bottom.
As MS futures are a bit shaky right now, the provider agnostic nature & higher barriers of entry of both C++ & Numerical Analysis seem like a rational choice to me.
It's also fascinating – stepping outside the box.
This is not the first time I've delved into numerical analysis.
6 months ago I read Numerical methods with Applications, which can be found for free online: http://nm.mathforcollege.com/

2 years ago I learned the .NET Extreme Optimization library www.extremeoptimization.com – not bad
2.5 years ago I read Schaums Numerical Analysis book http://amzn.to/V5yuLI - not an easy read, as topics jump back & forth across chapters:

3 years ago I read Practical Numerical Methods with C# http://amzn.to/V5yCL9 (which is a toy learning language for this kind of stuff)

I also read through AI a Modern Approach 3rd edition END to END http://amzn.to/V5yQSp - this took me a few years but was the most rewarding experience.

I'll post progress updates – see you on the other side !
Where is WebCL ?
The Khronos WebCL working group is working on a JavaScript binding to the OpenCL standard so that HTML 5 compliant browsers can host GPGPU web apps – e.g. for image processing or physics for WebGL games - http://www.khronos.org/webcl/ . While Nokia & Samsung have some prototype WebCL APIs, Intel has one-upped them with a higher level of abstraction: RiverTrail.
Intro to RiverTrail
Intel Labs JavaScript RiverTrail provides GPU accelerated SIMD data-parallelism in web applications via a familiar JavaScript programming paradigm. It extends JavaScript with simple deterministic data-parallel constructs that are translated at runtime into a low-level hardware abstraction layer. With its high-level JS API, programmers do not have to learn a new language or explicitly manage threads, orchestrate shared data synchronization or scheduling. It has been proposed as a draft specification to ECMA (known as the ECMA strawman).
RiverTrail runs in all popular browsers (except I.E. of course).
To get started, download a prebuilt version https://github.com/downloads/RiverTrail/RiverTrail/rivertrail-0.17.xpi , install Intel's OpenCL SDK http://www.intel.com/go/opencl and try out the interactive River Trail shell http://rivertrail.github.com/interactive
For a video overview, see http://www.youtube.com/watch?v=jueg6zB5XaM .
ParallelArray
The ParallelArray type is the central component of this API & is a JS object that contains an ordered collection of scalars – i.e. multidimensional uniform arrays. A shape property describes the dimensionality and size – e.g. a 2D RGBA image will have shape [height, width, 4]. ParallelArrays are immutable & fluent – they are manipulated by invoking methods on them which produce new ParallelArray objects. ParallelArray supports several constructors over arrays, functions & even the canvas.
// Create an empty Parallel Array
var pa0 = new ParallelArray();
// pa0 = <>
// Create a ParallelArray out of a nested JS array.
// Note that the inner arrays are also ParallelArrays
var pa1 = new ParallelArray([ [0,1], [2,3], [4,5] ]);
// pa1 = <<0,1>, <2,3>, <4,5>>
// Create a two-dimensional ParallelArray with shape [3, 2] using the comprehension constructor
var pa7 = new ParallelArray([3, 2], function(iv){return iv[0] * iv[1];});
// pa7 = <<0,0>, <0,1>, <0,2>>
// Create a ParallelArray from canvas. This creates a PA with shape [w, h, 4].
var pa8 = new ParallelArray(canvas);
// pa8 = CanvasPixelArray
ParallelArray exposes fluent API functions that take an elemental JS function for data manipulation: map, combine, scan, filter, and scatter that return a new ParallelArray. Other functions are scalar - reduce returns a scalar value & get returns the value located at a given index.
The onus is on the developer to ensure that the elemental function does not defeat data parallelization optimization (avoid global var manipulation, recursion).
For reduce & scan, order is not guaranteed - the onus is on the dev to provide an elemental function that is commutative and associative so that scan will be deterministic – E.g. Sum is associative, but Avg is not.
map
Applies a provided elemental function to each element of the source array and stores the result in the corresponding position in the result array. The map method is shape preserving & index free - can not inspect neighboring values.
// Adding one to each element.
var source = new ParallelArray([1,2,3,4,5]);
var plusOne = source.map(function inc(v) { return v+1; }); //<2,3,4,5,6>
combine
Combine is similar to map, except an index is provided. This allows elemental functions to access elements from the source array relative to the one at the current index position. While the map method operates on the outermost dimension only, combine can choose how deep to traverse - it provides a depth argument to specify the number of dimensions it iterates over. The elemental function of combine accesses the source array & the current index within it - each element is computed by calling the get method of the source ParallelArray object with index i as argument. It requires more code but is more expressive.
var source = new ParallelArray([1,2,3,4,5]);
var plusOne = source.combine(function inc(i) { return this.get(i)+1; });
reduce
reduces the elements from an array to a single scalar result – e.g. Sum.
// Calculate the sum of the elements
var source = new ParallelArray([1,2,3,4,5]);
var sum = source.reduce(function plus(a,b) { return a+b; });
scan
Like reduce, but stores the intermediate results – returns a ParallelArray whose ith element is the result of using the elemental function to reduce the elements between 0 and i in the original ParallelArray.
// do a partial sum
var source = new ParallelArray([1,2,3,4,5]);
var psum = source.scan(function plus(a,b) { return a+b; }); //<1, 3, 6, 10, 15>
scatter
a reordering function - specify for a certain source index where it should be stored in the result array.
An optional conflict function can prevent an exception if two source values are assigned the same position of the result:
var source = new ParallelArray([1,2,3,4,5]);
var reorder = source.scatter([4,0,3,1,2]); // <2, 4, 5, 3, 1>
// if there is a conflict use the max. use 33 as a default value.
var reorder = source.scatter([4,0,3,4,2], 33, function max(a, b) {return a>b?a:b; }); //<2, 33, 5, 3, 4>
filter
// filter out values that are not even
var source = new ParallelArray([1,2,3,4,5]);
var even = source.filter(function even(iv) { return (this.get(iv) % 2) == 0; }); // <2,4>
Flatten
used to collapse the outer dimensions of an array into a single dimension.
pa = new ParallelArray([ [1,2], [3,4] ]); // <<1,2>,<3,4>>
pa.flatten(); // <1,2,3,4>
Partition
used to restore the original shape of the array.
var pa = new ParallelArray([1,2,3,4]); // <1,2,3,4>
pa.partition(2); // <<1,2>,<3,4>>
Get
return value found at the indices or undefined if no such value exists.
var pa = new ParallelArray([0,1,2,3,4], [10,11,12,13,14], [20,21,22,23,24]);
pa.get([1,1]); // 11
pa.get([1]); // <10,11,12,13,14>
Sunday, October 28, 2012
ASP.NET Web API is an ideal platform for building RESTful applications on the .NET Framework. While I may be more partial to NodeJS these days, there is no denying that WebAPI is a well engineered framework.
What follows is my investigation of how to leverage WebAPI to construct a RESTful frontend API.
The Advantages of REST Methodology over SOAP
- Simpler API for CRUD ops
- Standardize Development methodology - consistent and intuitive
- Standards based → client interop
- Wide industry adoption, Ease of use → easy to add new devs
- Avoid service method signature blowout
- Smaller payloads than SOAP
- Stateless → no session data means multi-tenant scalability
- Cache-ability
- Testability
· utilize HTTP Protocol - Usage of HTTP methods for CRUD, standard HTTP response codes, common HTTP headers and Mime Types
· Resources are mapped to URLs, actions are mapped to verbs and the rest goes in the headers.
· keep the API semantic, resource-centric – A RESTful, resource-oriented service exposes a URI for every piece of data the client might want to operate on. A REST-RPC Hybrid exposes a URI for every operation the client might perform: one URI to fetch a piece of data, a different URI to delete that same data. utilize Uri to specify CRUD op, version, language, output format:
http://api.MyApp.com/{ver}/{lang}/{resource_type}/{resource_id}.{output_format}?{key&filters}
· entity CRUD operations are matched to HTTP methods:
- · Create - POST / PUT
- · Read – GET - cacheable
- · Update – PUT
- · Delete - DELETE
· Use Uris to represent a hierarchies - Resources in RESTful URLs are often chained
· Statelessness allows for idempotency – apply an op multiple times without changing the result. POST is non-idempotent, the rest are idempotent (if DELETE flags records instead of deleting them).
· Cache indication - Leverage HTTP headers to label cacheable content and indicate the permitted duration of cache
· PUT vs POST - The client uses PUT when it determines which URI (Id key) the new resource should have. The client uses POST when the server determines the key. PUT takes a second param – the id. POST creates a new resource: the server assigns the URI for the new object and returns this URI as part of the response message. Note: The PUT method replaces the entire entity. That is, the client is expected to send a complete representation of the updated product. If you want to support partial updates, the PATCH method is preferred.
· DELETE deletes a resource at a specified URI – typically takes an id param
· Leverage Common HTTP Response Codes in response headers
- 200 OK: Success
- 201 Created - Used on POST request when creating a new resource.
- 304 Not Modified: no new data to return.
- 400 Bad Request: Invalid Request.
- 401 Unauthorized: Authentication.
- 403 Forbidden: Authorization
- 404 Not Found – entity does not exist.
- 406 Not Acceptable – the server cannot produce a representation that matches the request's Accept headers.
- 409 Conflict - For POST / PUT requests if the resource already exists.
- 500 Internal Server Error
- 503 Service Unavailable
· Leverage uncommon HTTP Verbs to reduce payload sizes
- HEAD - retrieves just the resource meta-information.
- OPTIONS returns the actions supported for the specified resource.
- PATCH - partial modification of a resource.
· When using PUT, POST or PATCH, send the data as a document in the body of the request. Don't use query parameters to alter state.
· Utilize Headers for content negotiation, caching, authorization, throttling
o Content Negotiation – choose representation (e.g. JSON or XML and version), language & compression. Signal via RequestHeader.Accept & ResponseHeader.Content-Type
Accept: application/json;version=1.0
Accept-Language: en-US
Accept-Charset: UTF-8
Accept-Encoding: gzip
o Caching - ResponseHeader: Expires (absolute expiry time) or Cache-Control (relative expiry time)
o Authorization - basic HTTP authentication uses the RequestHeader.Authorization to specify a base64 encoded string "username:password". Base64 is an encoding, not encryption, so it should be used in combination with SSL/TLS (HTTPS). It can also leverage OAuth2 3rd party token-claims authorization.
Authorization: Basic sQJlaTp5ZWFslylnaNZ=
o Rate Limiting - Not currently part of HTTP so specify non-standard headers prefixed with X- in the ResponseHeader.
X-RateLimit-Limit: 10000
X-RateLimit-Remaining: 9990
· HATEOAS Methodology - Hypermedia As The Engine Of Application State – leverage API as a state machine where resources are states and the transitions between states are links between resources and are included in their representation (hypermedia) – get API metadata signatures from the response Link header - in a truly REST based architecture any URL, except the initial URL, can be changed, even to other servers, without worrying about the client.
· error responses - Do not just send back a 200 OK with every response. Response should consist of HTTP error status code (JQuery has automated support for this), A human readable message , A Link to a meaningful state transition , & the original data payload that was problematic.
· the URIs will typically map to a server-side controller and a method name specified by the type of request method. Stuffing all your calls into just four methods is not as crazy as it sounds.
· Scoping - Path variables look like you’re traversing a hierarchy, and query variables look like you’re passing arguments into an algorithm
· Mapping URIs to Controllers - having one controller for each resource is not a rule – you can consolidate - route requests to the appropriate controller and action method
· Keep URLs Consistent - Sometimes it’s tempting to just shorten our URIs. This is not recommended, as it can cause confusion
· Join Naming – for m-m entity relations there may be multiple hierarchy traversal paths
· Routing – useful level of indirection for versioning, server backend mocking in development
ASP.NET Web API implements a lot of (but not all) RESTful API design considerations as part of its infrastructure and via its coding conventions.
When developing an API there are basically three main steps:
1. Plan out your URIs
2. Setup return values and response codes for your URIs
3. Implement a framework for your API.
· Leverage Models MVC folder
· Repositories – support IoC for tests, abstraction
· Create DTO classes – a level of indirection decouples & allows swap out
· Self links can be generated using the UrlHelper
· Use IQueryable to support projections across the wire
· Models can support restful navigation properties – ICollection<T>
· async mechanism for long running ops - return a response with a ticket – the client can then poll or be pushed the final result later.
· Design for testability - Test using HttpClient , JQuery ( $.getJSON , $.each) , fiddler, browser debug. Leverage IDependencyResolver – IoC wrapper for mocking
· Easy debugging - IE F12 developer tools: Network tab, Request Headers tab
· HTTP request method is matched to the method name. (This rule applies only to GET, POST, PUT, and DELETE requests.)
· {id}, if present, is matched to a method parameter named id.
· Query parameters are matched to parameter names when possible
· Done in config via Routes.MapHttpRoute – similar to MVC routing
· Can alternatively:
- o decorate controller action methods with HttpDelete, HttpGet, HttpHead, HttpOptions, HttpPatch, HttpPost, or HttpPut, plus the ActionNameAttribute
- o use AcceptVerbsAttribute to support other HTTP verbs: e.g. PATCH, HEAD
- o use NonActionAttribute to prevent a method from getting invoked as an action
· route table Uris can support placeholders (via curly braces{}) – these can support default values and constraints, and optional values
· The framework selects the first route in the route table that matches the URI.
· Response code: By default, the Web API framework sets the response status code to 200 (OK). But according to the HTTP/1.1 protocol, when a POST request results in the creation of a resource, the server should reply with status 201 (Created). Non Get methods should return HttpResponseMessage
· Location: When the server creates a resource, it should include the URI of the new resource in the Location header of the response.
public HttpResponseMessage PostProduct(Product item)
{
    item = repository.Add(item);
    var response = Request.CreateResponse<Product>(HttpStatusCode.Created, item);
    string uri = Url.Link("DefaultApi", new { id = item.Id });
    response.Headers.Location = new Uri(uri);
    return response;
}
· Decorate Models / DTOs with System.ComponentModel.DataAnnotations properties RequiredAttribute, RangeAttribute.
· Check payloads using ModelState.IsValid
· Under posting – leave out values in JSON payload → JSON formatter assigns a default value. Use with RequiredAttribute
· Over-posting - if model has RO properties → use DTO instead of model
· Can hook into pipeline by deriving from ActionFilterAttribute & overriding OnActionExecuting
· Done in App_Start folder > WebApiConfig.cs – static Register method: HttpConfiguration param: The HttpConfiguration object contains the following members.
| Member | Description |
| DependencyResolver | Enables dependency injection for controllers. |
| Filters | Action filters – e.g. exception filters. |
| Formatters | Media-type formatters. by default contains JsonFormatter, XmlFormatter |
| IncludeErrorDetailPolicy | Specifies whether the server should include error details, such as exception messages and stack traces, in HTTP response messages. |
| Initializer | A function that performs final initialization of the HttpConfiguration. |
| MessageHandlers | HTTP message handlers - plug into pipeline |
| ParameterBindingRules | A collection of rules for binding parameters on controller actions. |
| Properties | A generic property bag. |
| Routes | The collection of routes. |
| Services | The collection of services. |
· Configure JsonFormatter for circular references to support links: PreserveReferencesHandling.Objects
· create a help page for a web API, by using the ApiExplorer class.
· The ApiExplorer class provides descriptive information about the APIs exposed by a web API as an ApiDescription collection
· create the help page as an MVC view
public ILookup<string, ApiDescription> GetApis()
{
    return _explorer.ApiDescriptions.ToLookup(
        api => api.ActionDescriptor.ControllerDescriptor.ControllerName);
}
· provide documentation for your APIs by implementing the IDocumentationProvider interface. Documentation strings can come from any source that you like – e.g. extract XML comments or define custom attributes to apply to the controller
[ApiDoc("Gets a product by ID.")]
[ApiParameterDoc("id", "The ID of the product.")]
public HttpResponseMessage Get(int id)
· GlobalConfiguration.Configuration.Services – add the documentation Provider
· To hide an API from the ApiExplorer, add the ApiExplorerSettingsAttribute
· Plug into request / response pipeline – derive from DelegatingHandler and override the SendAsync method – e.g. for logging error codes, adding a custom response header
· Can be applied globally or to a specific route
· Throw HttpResponseException on method failures – specify HttpStatusCode enum value – examine this enum, as its values map well to typical op problems
· Exception filters – derive from ExceptionFilterAttribute & override OnException. Apply on Controller or action methods, or add to global HttpConfiguration.Filters collection
· HttpError object provides a consistent way to return error information in the HttpResponseException response body.
· For model validation, you can pass the model state to CreateErrorResponse, to include the validation errors in the response
public HttpResponseMessage PostProduct(Product item)
{
    if (!ModelState.IsValid)
    {
        return Request.CreateErrorResponse(HttpStatusCode.BadRequest, ModelState);
    }
    // ... otherwise process the item as usual
}
· Cookie header in request and Set-Cookie headers in a response - Collection of CookieState objects
· Specify Expiry, max-age
resp.Headers.AddCookies(new CookieHeaderValue[] { cookie });
· Defaults to application/json
· Request Accept header and response Content-Type header
· determines how Web API serializes and deserializes the HTTP message body. There is built-in support for XML, JSON, and form-urlencoded data
· customizable formatters can be inserted into the pipeline
· POCO serialization is opt out via JsonIgnoreAttribute, or use DataMemberAttribute for optin
· JSON serializer leverages NewtonSoft Json.NET
· loosely structured JSON objects are deserialized as JObject, which can be consumed via the C# dynamic keyword
· to handle circular references in json:
json.SerializerSettings.PreserveReferencesHandling = PreserveReferencesHandling.All → {"$ref":"1"}.
· To preserve object references in XML
[DataContract(IsReference=true)]
· Content negotiation
- Accept: Which media types are acceptable for the response, such as “application/json,” “application/xml,” or a custom media type such as "application/vnd.example+xml"
- Accept-Charset: Which character sets are acceptable, such as UTF-8 or ISO 8859-1.
- Accept-Encoding: Which content encodings are acceptable, such as gzip.
- Accept-Language: The preferred natural language, such as “en-us”.
o Web API uses the Accept and Accept-Charset headers. (At this time, there is no built-in support for Accept-Encoding or Accept-Language.)
· Controller methods can take JSON representations of DTOs as params – auto-deserialization
· Typical JQuery GET request:
function find() {
    var id = $('#prodId').val();
    $.getJSON("api/products/" + id,
        function (data) {
            var str = data.Name + ': $' + data.Price;
            $('#product').text(str);
        })
    .fail(
        function (jqXHR, textStatus, err) {
            $('#product').text('Error: ' + err);
        });
}
· Typical GET response:
HTTP/1.1 200 OK
Server: ASP.NET Development Server/10.0.0.0
Date: Mon, 18 Jun 2012 04:30:33 GMT
X-AspNet-Version: 4.0.30319
Cache-Control: no-cache
Pragma: no-cache
Expires: -1
Content-Type: application/json; charset=utf-8
Content-Length: 175
Connection: Close
[{"Id":1,"Name":"TomatoSoup","Price":1.39,"ActualCost":0.99},{"Id":2,"Name":"Hammer", "Price":16.99,"ActualCost":10.00},{"Id":3,"Name":"Yo yo","Price":6.99,"ActualCost": 2.05}]
· Leverage Query Options $filter, $orderby, $top and $skip to shape the results of controller actions annotated with the [Queryable] attribute.
[Queryable]
public IQueryable<Supplier> GetSuppliers()
· Query:
~/Suppliers?$filter=Name eq 'Microsoft'
· Applies the following selection filter on the server:
GetSuppliers().Where(s => s.Name == "Microsoft")
· Will pass the result to the formatter.
· true support for the OData format is still limited - no support for creates, updates, deletes, $metadata and code generation etc
· vnext: ability to configure how EditLinks, SelfLinks and Ids are generated
Self Hosting
no dependency on ASP.NET or IIS:
using (var server = new HttpSelfHostServer(config))
{
    server.OpenAsync().Wait();
    // ... keep the process alive here while the server handles requests
}
· traceability tools, metrics – e.g. send to nagios
· use your choice of tracing/logging library, whether that is ETW, NLog, log4net, or simply System.Diagnostics.Trace.
· To collect traces, implement the ITraceWriter interface
public class SimpleTracer : ITraceWriter
{
    public void Trace(HttpRequestMessage request, string category, TraceLevel level,
        Action<TraceRecord> traceAction)
    {
        TraceRecord rec = new TraceRecord(request, category, level);
        traceAction(rec);
        WriteTrace(rec);
    }
    // WriteTrace (not shown) forwards the record to your logging library
}
· register the service with config
· programmatically trace – has helper extension methods:
Configuration.Services.GetTraceWriter().Info(
· Performance tracing - pipeline writes traces at the beginning and end of an operation - TraceRecord class includes a Timestamp property, Kind property set to TraceKind.Begin / End
· Roles class methods: RoleExists, AddUserToRole
· WebSecurity class methods: UserExists, CreateUserAndAccount
· Request.IsAuthenticated
· Leverage HTTP 401 (Unauthorized) response
· [AuthorizeAttribute(Roles="Administrator")] – can be applied to Controller or its action methods
· See section in WebApi document on "Claim-based-security for ASP.NET Web APIs using DotNetOpenAuth" – adapt this to STS → the Web API Host exposes secured Web APIs which can only be accessed by presenting a valid token issued by the trusted issuer. http://zamd.net/2012/05/04/claim-based-security-for-asp-net-web-apis-using-dotnetopenauth/
· Use MVC membership provider infrastructure and add a DelegatingHandler child class to the WebAPI pipeline - http://stackoverflow.com/questions/11535075/asp-net-mvc-4-web-api-authentication-with-membership-provider - this will perform the login actions
· Then use AuthorizeAttribute on controllers and methods for role mapping- http://sixgun.wordpress.com/2012/02/29/asp-net-web-api-basic-authentication/
· Alternate option here is to rely on MVC App : http://forums.asp.net/t/1831767.aspx/1
Wednesday, October 10, 2012
HPC Job Types
HPC has 3 types of jobs http://technet.microsoft.com/en-us/library/cc972750(v=ws.10).aspx
· Task Flow – vanilla sequence

· Parametric Sweep – concurrently run multiple instances of the same program, each with a different work unit input

· MPI – message passing between master & slave tasks

But when you try to go outside the box – job tasks that spawn jobs, blocking the parent task – you run the risk of resource starvation, deadlocks, and recursive, non-converging or exponential blow-up.
The solution to this is to write some performance monitoring and job scheduling code. You can do this in 2 ways:
- manually control scheduling - allocate/ de-allocate resources, change job priorities, pause & resume tasks , restrict long running tasks to specific compute clusters
- Semi-automatically - set threshold params for scheduling.
How – Control Job Scheduling
In order to manage the tasks and resources that are associated with a job, you will need to access the ISchedulerJob interface - http://msdn.microsoft.com/en-us/library/microsoft.hpc.scheduler.ischedulerjob_members(v=vs.85).aspx
This really allows you to control how a job is run – you can access & tweak the following features:
- max / min resource values
- whether job resources can grow / shrink, and whether jobs can be pre-empted, whether the job is exclusive per node
- the creator process id & the job pool
- timestamp of job creation & completion
- job priority, hold time & run time limit
- Re-queue count
- Job progress
- Max/ min Number of cores, nodes, sockets, RAM
- Dynamic task list – can add / cancel jobs on the fly
- Job counters
When – poll perf counters
Tweaking the job scheduler should be done on the basis of resource utilization according to PerfMon counters – HPC exposes 2 Perf objects: Compute Clusters, Compute Nodes
http://technet.microsoft.com/en-us/library/cc720058(v=ws.10).aspx
You can monitor running jobs according to dynamic thresholds – use your own discretion:
- Percentage processor time
- Number of running jobs
- Number of running tasks
- Total number of processors
- Number of processors in use
- Number of processors idle
- Number of serial tasks
- Number of parallel tasks
Design Your algorithms correctly
Finally, don’t assume you have unlimited compute resources in your cluster – design your algorithms with the following factors in mind:
· Branching factor - http://en.wikipedia.org/wiki/Branching_factor - dynamically optimize the number of children per node

· cutoffs to prevent explosions - http://en.wikipedia.org/wiki/Limit_of_a_sequence - not all functions converge after n attempts. You also need a threshold of good enough, diminishing returns
· heuristic shortcuts - http://en.wikipedia.org/wiki/Heuristic - sometimes an exhaustive search is impractical and short cuts are suitable
· Pruning http://en.wikipedia.org/wiki/Pruning_(algorithm) – remove / de-prioritize unnecessary tree branches

· avoid local minima / maxima - http://en.wikipedia.org/wiki/Local_minima - sometimes an algorithm can't converge on the global optimum because it gets stuck in a local one – try simulated annealing, random-restart hill climbing or genetic algorithms to get out of these ruts

· watch out for rounding errors – http://en.wikipedia.org/wiki/Round-off_error - multiple iterations in parallel can quickly amplify them & blow up your algo ! Use an epsilon for comparisons, and be wary of floating point errors, truncations, approximations
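Two small defensive habits for the rounding point can be sketched in C++. Both helpers are illustrative only: the names kahan_sum and nearly_equal and the default epsilon are my own assumptions, and compensated summation is just one common way to tame accumulated error (it is defeated by aggressive flags such as -ffast-math).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Kahan (compensated) summation: carries the low order bits lost by each addition.
double kahan_sum(const std::vector<double>& values) {
    double sum = 0.0, carry = 0.0;
    for (double v : values) {
        const double y = v - carry;
        const double t = sum + y;
        carry = (t - sum) - y;  // recovers what was rounded away in sum + y
        sum = t;
    }
    return sum;
}

// Compare with a relative epsilon instead of ==.
bool nearly_equal(double a, double b, double rel_eps = 1e-9) {
    assert(rel_eps > 0.0);
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= rel_eps * std::max(scale, 1.0);
}
```

Adding a thousand values of 1e-16 to 1.0 one at a time is a handy test: each addend is below half a unit in the last place of 1.0, so a plain running sum tends to discard them all, while the compensated sum keeps their total.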
Happy Coding !