Principal Data Scientist, Director for Data Science, AI, Big Data Technologies. O’Reilly author on distributed computing and machine learning.

Natalino leads the definition, design and implementation of data-driven financial and telecom applications. He has previously served as Enterprise Data Architect at ING in the Netherlands, focusing on fraud prevention/detection, SoC, cybersecurity, customer experience, and core banking processes.

Prior to that, he worked as a senior researcher at Philips Research Laboratories in the Netherlands on system-on-a-chip architectures, distributed computing, and compilers. All-round Technology Manager, Product Developer, and Innovator with a 15+ year track record in research, development, and management of distributed architectures, scalable services, and data-driven applications.

Saturday, August 31, 2013

Streaming Computing

An overview of parallelism and the trade-offs between throughput and latency, which translate into the choice between batch and real-time processing.

A brief description of the reasons, followed by a selection of technologies.

Towards an architecture that combines the benefits of both: batch distributed systems capable of crunching petabytes of data on one side, and distributed event-driven systems capable of processing thousands of events per second on the other.
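The tension between throughput and latency can be illustrated with a toy cost model. This is a minimal sketch, not taken from the talk: the function name `batch_tradeoff` and all parameter values (a fixed per-call overhead, a per-event cost, and a constant arrival interval) are assumptions chosen only for illustration.

```python
def batch_tradeoff(batch_size, overhead_ms=10.0, per_event_ms=0.1, interarrival_ms=1.0):
    """Return (throughput in events/s, worst-case latency in ms) for a batch size.

    Hypothetical model: every processing call pays a fixed overhead plus a
    per-event cost, and events arrive at a constant interval.
    """
    # Time to process one full batch.
    processing_ms = overhead_ms + batch_size * per_event_ms
    # Processing capacity: events handled per second of processing time.
    throughput = batch_size / processing_ms * 1000.0
    # The first event in a batch waits for the remaining events to arrive,
    # then for the whole batch to be processed.
    latency_ms = (batch_size - 1) * interarrival_ms + processing_ms
    return throughput, latency_ms


if __name__ == "__main__":
    for size in (1, 10, 100, 1000):
        throughput, latency_ms = batch_tradeoff(size)
        print(f"batch={size:>4}  throughput={throughput:8.0f} events/s  "
              f"latency={latency_ms:8.1f} ms")
```

Under these assumptions, larger batches amortize the fixed overhead and raise throughput, while the wait to fill the batch raises latency. That is the batch versus real-time trade-off in miniature.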