15th ~ 16th OCT 2015 MADRID, SPAIN #BDS15
THE 4th EDITION OF BIG DATA IN Oct 2015 WAS A RESOUNDING SUCCESS.
University of A Coruña
Artificial Intelligence Lab Head
Amparo Alonso-Betanzos is a Professor in the Department of Computer Science and coordinator of the Laboratory for Research and Development in Artificial Intelligence at the University of A Coruña. She is currently working on the development of Machine Learning algorithms and on their applications to several fields, such as predictive maintenance in engineering and prediction of gene expression in bioinformatics.
Topic
BEGIN AT THE BEGINNING: FEATURE SELECTION FOR BIG DATA
Thursday 16th
15:15 pm to 16:00 pm
Room 25
Allegro
Java Developer
Maciej Arciuch
Maciek works at Allegro on developing a scalable, distributed data ingestion and analysis system. Former Java EE developer, now a Scala, Spark and functional programming enthusiast.
Topic
A REAL-TIME DATA INGESTION SYSTEM OR: HOW I LEARNED TO STOP WORRYING AND TAKE GOOD ADVICES
Friday 16th
13:30 pm to 14:15 pm
Room 19
IIC
Chief Data Scientist
As a Data Scientist at IIC, Álvaro Barbero has taken part in several Big Data projects in fields such as fraud detection, sentiment analysis on social networks, demand forecasting and stock optimization. As part of his research and academic activities, he has authored 30 international research papers and has lectured in MSc-level courses on data mining and machine learning.
Topic
BEYOND ANALYTICS: PRESCRIPTIVE ANALYTICS FOR THE FUTURE OF YOUR BUSINESS
Friday 16th
13:30 pm to 14:15 pm
Room 19
ITRS Group
Software Developer
Michael works at ITRS Group in Malaga building distributed systems for streaming analytics. Coming from a background in monitoring software for investment banks, he now spends his time using Scala and Akka to maintain highly available real-time streams and queries. By night he plays with audio programming and signal processing and recently finished a spell preparing teachers in the UK for the introduction of the new Computing curriculum.
Topic
A NEW STREAMING COMPUTATION ENGINE FOR REAL-TIME ANALYTICS
Thursday 15th
16:00 pm to 16:45 pm
Room 19
CERN
Head of the Control and Monitoring Platform
Matthias Braeger
Matthias Braeger is a Software Engineer at CERN, the European Organization for Nuclear Research. He is responsible for the Technical Infrastructure Monitoring (TIM) system, a 24/7 service used by many different user groups to supervise and control a variety of infrastructure spread across the CERN site. He also heads the CERN Control and Monitoring Platform C2MON [C-2-MON].
ESA
EUCLID SCIENCE OPERATIONS CENTER SYSTEM ENGINEER
Guillermo Buenadicha
Guillermo Buenadicha is originally from Avila, Spain, and studied Telecommunications Engineering at the Polytechnic University of Madrid.
He joined ESA in 1995, initially working on the ISO mission as a platform operations engineer at Villafranca, now ESAC (European Space Astronomy Centre). In 2000, he returned to ESAC to join the XMM-Newton science operations team, and in 2007 moved to the Planck science operations team. Since January 2008, he has been working at ESAC as SMOS Payload Engineer.
ING
Data Architect
Natalino is currently data architect at ING Retail in the Netherlands, where he leads the definition, design and implementation of big/fast data solutions for data-driven financial applications such as personalized marketing and predictive analytics. He is an all-round Software Architect, Data Technologist and Innovator, with 15+ years of experience in research, development and management of distributed architectures and scalable services and applications. He previously served as senior researcher at Philips Research Laboratories in the Netherlands, on the topics of system-on-a-chip architectures, distributed computing and parallelizing compilers. He blogs regularly about data analytics, data science and Scala reactive programming at natalinobusa.com.
Topic
REAL-TIME ANOMALY DETECTION WITH CASSANDRA, SPARK ML AND AKKA
Friday 16th
17:15 pm to 18:00 pm
Room 25
Treelogic
PhD. Senior Researcher
Rubén Casado has worked as Assistant Professor at the University of Oviedo. He has been involved in many research projects, both national and international in scope. He is currently responsible for the Big Data research program at Treelogic. His research interests include Big Data, Distributed Data Management and Software Testing.
Topic
PROTEUS: SCALABLE ONLINE MACHINE LEARNING FOR PREDICTIVE ANALYTICS
Friday 16th
16:15 pm to 17:00 pm
Room 19
DatKnoSys
CTO
Isaac Ciprés, CTO of DatKnoSys, has more than 10 years of hands-on experience in data analysis and data mining. For 7 years he developed an analytical value-oriented database engine and BI products at Illuminate Solutions S.L. He currently works at DatKnoSys, developing technology and DW software.
Topic
HOW TO INTEGRATE BIG DATA ONTO AN ANALYTICAL PORTAL, BIG DATA BENCHMARKING FOR DECISION MAKING
Thursday 15th
13:15 pm to 14:00 pm
Room 25
Keedio
Supercomputation Specialist
Alessio Comisso
With a Master's in Computational Physics and a PhD in Materials Engineering, Alessio Comisso worked at King's College London as an HPC officer for 8 years. His experience includes the design, procurement, deployment and administration of HPC and Big Data Linux clusters and support of multidisciplinary software. At Keedio he is working on the integration of big data components within the Ambari Keedio Stack and on the development of their Vagrant-based sandbox: keedio-vagrant.
Pivotal
Sr. Field Engineer
Engineer and Consultant with more than 20 years of experience in different technical and sales roles, mainly IT Consulting and Web/Mobile Application Development, specialized in social aggregation tools, data analytics, gamification and learning systems. I have also developed a number of projects, mainly in mobile and social aggregation areas of expertise.
Topic
BUILDING A REAL-TIME STOCK PREDICTION ENGINE POWERED BY SPRING XD, APACHE GEODE AND SPARK ML
Thursday 15th
17:45 pm to 18:30 pm
Room 19
Hospital Central de la Cruz Roja
Data Scientist
Javier holds a PhD and specializes in Geriatric Medicine. He has worked as a senior doctor for the Public Health System in Madrid (Spain) for the past 25 years. He has participated in multiple research initiatives geared towards improving healthcare delivery for elderly patients. Javier is also a professor at Universidad Alfonso X El Sabio, where he received his Master's Degree in Research Methodology in Health Sciences.
Topic
BIG DATA AS A GAME-CHANGER OF CLINICAL RESEARCH STRATEGIES
Thursday 15th
12:15 pm to 13:00 pm
Room 25
Cloudera
Big Data security expert
Alex Gonzalez is a Big Data security expert working for Cloudera; he specializes in Cloudera Navigator Encrypt and Cloudera Navigator KeyTrustee. He has also participated in the development of storage systems such as EMC SANCopy and EMC MirrorView; his current efforts are focused on the development of Linux security kernel modules. In 2005 he received, in Mexico, the national award for science and technology.
Topic
SECURING BIG DATA AT REST WITH ENCRYPTION FOR HADOOP, CASSANDRA AND MONGODB ON RED HAT
Thursday 15th
16:00 pm to 16:45 pm
Room 25
8Kdata
CEO
Álvaro is a 36-year-old IT entrepreneur based in Madrid, Spain. Founder and CTO at 8Kdata (www.8kdata.com), a database R&D company, he spends most of his time working on the ToroDB (www.torodb.com) project, the first open source NoSQL-on-SQL database, a MongoDB-compatible database that runs on top of PostgreSQL. He is a passionate software developer and open source advocate. Álvaro is a Java software developer and member of JavaSpecialists.eu, but also a DBA, trainer and frequent speaker at international conferences. He also founded the PostgreSQL Spanish User Group (www.postgrespaña.es), one of the largest PUGs in the world, with almost 500 members.
Itainnova
Lecturer in Artificial Intelligence and Information
He received the M.Sc. degree in Physics and the Ph.D. degree in Artificial Intelligence at the University of Zaragoza (Spain). Currently he is a project manager and head of the Information Systems group at the Technological Institute of Aragón. He has participated in several international R&D projects related to information management and Artificial Intelligence funded by the European Union in FP5, FP6, FP7 and the Eureka program (Celtic), and in national programs like Plan Avanza, working as technical director and software engineer. He was also a lecturer in software engineering at the University of Zaragoza; currently he lectures on intelligent systems and information processing at the University of San Jorge. This summer he participated in the 3rd Artificial General Intelligence course and congress. His research interests include data mining, Artificial Intelligence, semantics and Big Data in general.
Topic
IMPORTANCE OF COLLABORATION AMONG PROGRAMMERS AND DATA SCIENTISTS
Friday 16th
17:15 pm to 18:00 pm
Room 19
Agora
Data Scientist
Mr. Arkadiusz Jachnik is a PhD student at the Institute of Computing Science of Poznan University of Technology (Poland). Until July 2014 he was a Research Assistant at the same Institute. His research interests are machine learning and recommendation systems, and his current research concerns multi-label classification. Since 2014, he has been a Data Scientist in the Big Data Department of Agora S.A. He is currently working on a real-time user profiling system and highly scalable systems for text clustering and classification.
Topic
REAL-TIME USER PROFILING BASED ON SPARK STREAMING AND HBASE
Thursday 15th
17:00 pm to 17:45 pm
Room 19
Independent
Data architecture
Martyn Jones is an internationally recognised data architecture and management professional with a background in risk management, decision support and enterprise performance management. His current interests include Big Data, the evolution of data warehousing and the application of AI techniques to interpreting and understanding data.
Topic
BIG DATA, ANALYTICS AND 4th GENERATION DATA WAREHOUSING
Friday 16th
12:30 pm to 13:15 pm
Room 25
SAP
Developer
Stephan Kessler
Stephan Kessler is a developer in a Research and Development Team at SAP Walldorf. He is working on the integration of SAP's query execution engines in the Spark ecosystem. His main goals are to further improve the speed of Spark processing and to bring new features to the SQL extension. Before joining SAP, he completed his PhD and his Diploma (M.Sc.) at the Karlsruhe Institute of Technology, at the chair of database and information systems. Before joining the Big Data community, his research interests covered privacy in databases as well as sensor networks.
Topic
SAP HANA Vora – COMBINING ENTERPRISE AND HADOOP FOR IN-MEMORY PROCESSING
Thursday 15th
11:30 am to 12:15 pm
Room 25
IBM
Big Data Technical Leader
Frank Ketelaars is working as part of a European team focused on IBM Big Data solutions, including Hadoop and Real-time Analytical Processing. In his capacity, Frank leads the European technical community and conducts Big Data architecture sessions with customers and business partners across all industries. Prior to his current role, Frank has fulfilled various national and international assignments, being involved both in pre-sales and consulting assignments focused on application development, data warehousing, database replication and big data. Frank lives in The Netherlands and has 25 years of experience in the information technology area.
Novelti
CEO
Marco Laucelli is a professional with more than 10 years of experience in IT Innovation and New Business development. He is currently cofounder and CEO of Novelti, a Spanish startup focused on real-time analytics for IoT networks. Before launching Novelti, Marco worked at IBM for seven years as an IT Strategy and Innovation Consultant, responsible for the IBM Global Entrepreneur program in Southern Europe. Marco holds a PhD in Theoretical Physics from the University of Oviedo (Spain) and CERN. He has extensive technical experience in distributed computing, cloud and data analytics, having been actively involved in previous ventures in grid computing and in solution design for service-oriented architectures and data analytics in smart cities, for small and large companies.
Topic
UNDERSTANDING THE PHYSICAL WORLD: STREAMING IoT ANALYTICS FOR THE INTERNET OF THINGS
Friday 16th
18:45 pm to 19:30 pm
Room 25
MongoDB
Technical Evangelist
Norberto Leite is Technical Evangelist at MongoDB. In recent years Norberto has worked on large, scalable, distributed application environments, both as advisor and engineer. Prior to MongoDB, Norberto served as Big Data Engineer at Telefonica.
PiperLab
Co-Founder and Data Scientist
Alejandro Llorente is co-founder and Data Scientist at PiperLab, work that he combines with his doctoral studies at Universidad Carlos III de Madrid. In research, his interests center on the analysis of human mobility based on digital footprints and its implications for economic processes. Over the last five years he has worked on applying prediction and modeling techniques to marketing processes and the customer life cycle, especially in the context of social networks and large volumes of data. Alejandro also teaches at several institutions, such as IE Business School and AFI.
NEMsolutions
Data Scientist/engineer
Ion Marqués, PhD, is a Computer Engineer with an MSc in Computational Engineering and Intelligent Systems. He received his PhD in Computer Science from the University of the Basque Country in 2014. He is currently a Data Scientist and Engineer at NEM Solutions, building Big Data tools and developing AI algorithms for predictive maintenance of complex machines, specifically for the rail and energy industries.
University College London
Researcher
I hold a PhD in Computer Science and specialise in designing tailored software systems and algorithms that include Machine Learning techniques in different areas like engineering, advertising, … In the past, I have been part of laboratories in different academic institutions such as University of A Coruña, University of Florida and University College London. Currently, I am based in London and divide my time between post-doc research and lecturing at UCL and tech consulting for different industries and start-ups. I am part of some scientific networks such as the Spanish National Network for Big Data and HPC and the Spanish Association of Artificial Intelligence. I am also a mentor for UCL’s Computational Finance apprenticeships programme. My main interests are Machine Learning theory and applications, software engineering and especially Big Data technologies and architectures.
Stratio
CEO and Founding Partner
Oscar Méndez is Co-founder and CEO of Paradigma Tecnológico and Stratio. Paradigma is an Internet solutions company with clients, mostly IBEX 35 companies, in Spain and in 14 other countries. Stratio is a spin-off of Paradigma that uses best-of-breed Big Data technologies to cater for clients worldwide. Stratio is based in Palo Alto, CA.
Stratio
Big Data Developer
Santiago Mola
Santiago Mola is a Big Data Developer at Stratio. He works on projects with Apache Spark Streaming and SQL and is currently helping build the integration of Apache Spark with SAP’s new query execution engine. Santiago has worked previously as a researcher in the Machine Learning field and has contributed to Open Source projects for 9 years.
O'Reilly
Leader of the O'Reilly Learning team
An O'Reilly author, Paco is an expert in distributed systems, machine learning, predictive modeling, and cloud computing. He received his BS Math Sciences and MS Computer Science degrees from Stanford University, and has 25+ years of experience in the tech industry, ranging from Bell Labs to early-stage start-ups.
Topics
DATA SCIENCE IN 2016: MOVING UP
Thursday 15th
09:30 am to 10:15 am
Room 25
CRASH INTRODUCTION TO APACHE SPARK
Friday 16th
12:30 pm to 13:15 pm
Room 19
iTopTraining
Research and Development Director
Currently working as Research and Development Director for iTopTraining Advanced. iTopTraining is a company that provides technology solutions for the eLearning world. We develop products that make use of machine learning and big data technologies to improve the students' learning experience. Our technology ecosystem includes: Java, JavaScript, Scala, MongoDB, Cassandra, Neo4j, Amazon AWS, Microsoft Azure, etc.
LinkedIn
Senior Engineering Manager
Kartik is responsible for the Streams Infrastructure group working on the messaging and event processing infrastructure that powers LinkedIn. As part of this mission, he and his team are focused on designing, developing and running LinkedIn's PubSub technology (Apache Kafka), its change propagation pipeline from databases like Oracle/Espresso (Databus), and its stream processing infrastructure (Apache Samza).
Topic
ESSENTIAL INGREDIENTS FOR REAL TIME STREAM PROCESSING @ SCALE
Thursday 15th
10:15 am to 11:00 am
Room 25
Stratio
Big Data Software Architect
Andrés works for Stratio as Big Data Software Architect, involved in the development of its Big Data platform. He has been using Cassandra since late 2011. He is the main designer and developer of Stratio's Cassandra Lucene Index, a plugin that uses Lucene to extend C* index functionality, providing near-real-time search like that of Elasticsearch or Solr and speeding up Spark jobs. He has also submitted patches to Apache Cassandra related to secondary index improvement and generalization. He has spoken at Cassandra Summits and Cassandra Meetups.
Topic
GEOSPATIAL AND BITEMPORAL SEARCH IN C* WITH PLUGGABLE LUCENE INDEX
Thursday 15th
13:15 pm to 14:00 pm
Room 19
Barcelona Supercomputing Center
R&D in Data Performance and Scalability
Dr. Nicolas Poggi is an IT researcher with a focus on the performance and scalability of Web and data-intensive applications and infrastructures. He is currently leading a research project on upcoming architectures for Big Data at the Barcelona Supercomputing Center (BSC) and Microsoft Research joint center (http://www.bscmsrc.eu/). Nicolas Poggi received his PhD in 2014 at the Computer Architecture Department (DAC) of the Technical University of Catalonia (UPC - BarcelonaTech), where he also previously obtained his MS degree in 2007. He is part of the High Performance Computing group at DAC and of the Data Centric Computing (DCC) research group at BSC. He was also a Visiting Research Scholar at IBM Watson in 2012, working on Big Data and system performance topics. Nicolas can usually be found speaking at and organizing local IT meetup events in Barcelona; his contact details and publication list are at http://personals.ac.upc.edu/npoggi/.
Topic
AUTOMATING BIG DATA BENCHMARKING AND PERFORMANCE ANALYSIS WITH ALOJA'S OPEN SOURCE TOOLS
Thursday 15th
18:30 pm to 19:15 pm
Room 19
Stratio
Software Engineer
Óscar Puertas
Óscar Puertas is a software engineer at Stratio. He is currently working on the integration between SAP’s new lightweight query execution engine and the Apache Spark cluster computing framework. His main goals are to extend the SQL operations that are going to be pushed down to the data source and add new ones to be handled by Spark. Before joining Stratio, Óscar worked at Buongiorno, Norkom and Fon as a software analyst and developer; he finished his studies at Universidad Politécnica de Madrid, developing his MSc thesis at Joensuun Tiedepuisto, Finland.
Topic
SAP HANA Vora – COMBINING ENTERPRISE AND HADOOP FOR IN-MEMORY PROCESSING
Thursday 15th
11:30 am to 12:15 pm
Room 25
Couchbase
Java Developer
Matthew is the Lead Developer Advocate at Couchbase in EMEA, where he helps to grow the Couchbase community and works with developers to build scalable, low latency back-ends for their software projects.
Stratio
Software Architect
Working as a Software Architect at Stratio, Alberto Rodriguez has been involved in the inception and implementation of some of Stratio's platform modules, including real-time, streaming and ETL tools. Alberto is currently working on Stratio's visualization tool. He is also a committer and PMC member of the Apache Metamodel project.
Topic
GETTING THE BEST INSIGHTS FROM YOUR DATA USING APACHE METAMODEL
Thursday 15th
15:15 pm to 16:00 pm
Room 19
HP
Hadoop Architect and Software Gardener
Miguel likes to think of himself as a Software Gardener rather than a Software Engineer, a view that stems from his strong relationship with Agile Development. He describes himself as an evangelist of Agile Development, not only in software development but also in business activities: everything can be Agile if teams are focused on people, clients and process. To that end, he works hard and combines his skills in Executive MBA Business Administration, Team Development Leader, JEE Software and Hadoop Big Data Development Architect (Cloudera & Hortonworks).
Topic
ANALYZING ORGANIZATION E-MAILS IN NEAR REAL TIME USING HADOOP ECOSYSTEM TOOLS
Thursday 15th
13:15 pm to 14:00 pm
Room 25
UNIR
Data Scientist
Rafael has developed his professional career in the technology industry over the past 11 years. He has taken on roles in research, technology, project management, team leading, business development and middle management. He has worked for multinational firms such as Deloitte, Telefonica and Santander, engaging in or leading international initiatives combining technology, management and operations. Rafael currently works with UNIR and Hospital Central de la Cruz Roja on a research initiative in the field of Data Analytics, where SAS, R and Hadoop play a key role.
Topic
BIG DATA AS A GAME-CHANGER OF CLINICAL RESEARCH STRATEGIES
Thursday 15th
12:15 pm to 13:00 pm
Room 25
HP
Data Scientist Lead and Thinker
Passionate about machine learning and artificial intelligence. His constants are innovation and creativity, which are always present in his solutions. Alberto has dedicated his professional career to making computers learn: during his PhD he had the opportunity to make mobile devices identify individuals based on their hand geometry and to make microcontrollers detect physiological stress. At present, he is involved in data analysis oriented to business intelligence.
Topic
ANALYZING ORGANIZATION E-MAILS IN NEAR REAL TIME USING HADOOP ECOSYSTEM TOOLS
Thursday 15th
13:15 pm to 14:00 pm
Room 25
Mesosphere
Distributed Systems Engineer
Jörg Schad was a Ph.D. student at Saarland University's Computer Science Department. He works on challenges in distributed computing, especially the MapReduce approach and Cloud Computing. He was also a member of the Saarbrücken Graduate School of Computer Science. Jörg currently works as a Distributed Systems Engineer at Mesosphere.
Topic
APACHE MESOS AS THE FOUNDATION OF YOUR BIG DATA CLUSTER
Friday 16th
16:15 pm to 17:00 pm
Room 25
Scrapinghub
Search engineer and data scientist
Ex-Yandex engineer, with extensive experience in information retrieval and machine learning problems, building large-scale search engines and advanced data processing systems.
Topic
FRONTERA: OPEN SOURCE LARGE-SCALE WEB CRAWLING FRAMEWORK
Thursday 15th
17:00 pm to 17:45 pm
Room 25
iTopTraining
Data Scientist
Currently working as Data Scientist for iTopTraining Advanced. Developing machine learning algorithms involved in the analysis of behavioral user patterns while studying in eLearning platforms as well as working on Natural Language Processing Techniques used to process eLearning Content.
Facebook
Data Science Leader
Jason is a Quantitative Engineer at Facebook, where he creates visualization applications to yield insights from petabytes of data. Before that, he analyzed and visualized geo data as a Data Scientist at Where.com (acquired by PayPal), and worked on content-based music recommendation at The Echo Nest (acquired by Spotify). He is an avid violinist and chamber musician and cofounder of The Haydn Enthusiasts, a San Francisco collective that is performing the complete String Quartets of Joseph Haydn in monthly installments over a 3-year period. His current extracurricular interests include exploring integrations of his passion for classical music with data visualization.
Data Artisans
Co-founder
Kostas Tzoumas is a committer at Apache Flink and co-founder of data Artisans (data-artisans.com), a Berlin-based company that is developing and contributing to Apache Flink. Before founding data Artisans, Kostas was a postdoctoral researcher at TU Berlin, received a PhD in Computer Science from Aalborg University and has been with the University of Maryland and Microsoft Research in Redmond in the course of several internships.
Topic
APACHE FLINK: DATA STREAMING AS A BASIS FOR ALL ANALYTICS
Friday 16th
15:30 pm to 16:15 pm
Room 25
Google
Lead PM for Big Data, Google Cloud Platform
William is Lead PM for Big Data services on Google Cloud Platform. He manages a team of Product Managers to define and deliver the industry's best data processing and management platform.
Technological Institute of Aragon
Big Data Specialist
Jorge Vea Murguía
Jorge Vea Murguía is a computer engineer. He works as an engineer at the Technological Institute of Aragon, in the Logistics and ICT division within the Information Systems group. He has extensive experience in distributed systems, information systems and software process engineering. He has actively cooperated and participated in several international R&D projects funded by the European Union and in national programs within Plan PROFIT and Avanza.
Topic
IMPORTANCE OF COLLABORATION AMONG PROGRAMMERS AND DATA SCIENTISTS
Friday 16th
17:15 pm to 18:00 pm
Room 19
Zalando
Data Science Expert
Roland obtained his PhD at the Technische Universität Berlin. He worked as Head of Research at GA Financial Solutions GmbH and is currently a Data Science Expert at Zalando. Roland is interested in Warehouse Logistics and Machine Learning.
Topic
LARGE SCALE BAYESIAN INFERENCE OF RETAIL ARTICLE WEIGHTS
Thursday 15th
11:30 am to 12:15 pm
Room 19
Neo Technology
Chief Scientist
Co-author of the book "Graph Databases", Jim Webber is Chief Scientist at Neo Technology, the company behind the popular open source graph database Neo4j. Jim is intimate with things that can't be practically done without a graph database.