Louai Alarabi holds a Ph.D. degree from the Computer Science and Engineering Department at the University of Minnesota Twin Cities. He is currently working as a Chief Data Analytics Engineer at SITE company. He worked as an assistant professor at Umm Al-Qura University in the Department of Computer Science. He is a member of the Data Management Lab supervised by Prof. Mohamed Mokbel. He received his B.Sc. in Computer Science from Umm Al-Qura University. His research interest lies in the broad area of databases, with a focus on big data management and spatio-temporal computing. Louai enjoys building systems; he invented and developed several Big Data systems, including a Data Lake for the General Authority for Statistics; ST-Hadoop, a comprehensive distributed spatio-temporal data management system; TAREEG, a distributed MapReduce framework for extracting spatial features from maps; MNTG, a traffic generator; SHAREK, a ride-sharing system; and TAGHREED, a microblogs management system. His research was recognized with first place and a gold medal in the student research competition at ACM SIGSPATIAL/GIS 2018, a place among the best papers at SSTD 2017, a finalist position in the student research competition at ACM SIGMOD 2017, and the best demonstration award at the U-Spatial Symposium 2014. His research was funded through a collaboration with Umm Al-Qura University, the KACST GIS Technology Innovation Center, King Abdulaziz City for Science and Technology (KACST), and the University of Minnesota. His prior roles, some held while on leave from Umm Al-Qura University (UQU), include Director of Statistical Database and Data Warehouse at the General Authority for Statistics, Riyadh, KSA; Big Data Consultant at the General Authority for Statistics, Riyadh, KSA; Software Engineer at Advanced Electronics Company in Riyadh, KSA; Research Assistant at the KACST GIS Innovation Center, KSA; and Teaching Assistant at the University of Minnesota, USA.
The General Authority for Statistics (GASTAT) is keen on embarking on a Big Data Analytics journey and becoming one of the leading modern statistical institutes worldwide in terms of innovation, time to market, user delight, and service cost, despite its vast size and scale. The current GASTAT data processing model focuses on reporting needs and captures data provided by external entities and surveys. GASTAT needs to conduct periodic censuses on population, housing, agriculture, fisheries, poultry, business, industry, and other sectors of the economy, creating massive amounts of data. Such data must be collected, compiled, analyzed, and published on statistical platforms. The ability to handle complex data from semi-structured and unstructured sources, alongside conventional structured data processing, maximizes the value and quality of services and improves operational efficiency in the digital age. By managing terabytes of data efficiently, GASTAT has the potential to be a trendsetter for smart statistical data, empowering its representatives at every level with analytics and thus standardizing and speeding up the decision-making process.
Responsibilities included: preparing and discussing homework exercises and programming assignments, delivering lab tutorials and recitations, and grading exams and quizzes.
Courses: CSCI-4061: Operating System and CSCI-4707: Practice of Database Systems.
Member of the Data Management group doing research in microblog data management and big spatio-temporal data processing. Built these systems as proofs of concept by using and extending SpatialHadoop to support processing and indexing the temporal dimension, documented the outputs as research papers, and gave demonstrations for business purposes.
Responsibilities included: preparing and discussing homework exercises and programming assignments, delivering lab tutorials and recitations, grading exams and quizzes, and publishing a free online tutorial for students on CLIPS, an artificial intelligence language.
Courses: Computer Graphics, Advanced Programming Languages, Structured Programming Languages, Expert System, and Software Engineering.
Member of a research and development team working mainly on implementing the DLMS protocol on an ARM processor. Also developed an Electronic Gateway desktop application called Parameterization Software (PS), designed and developed as part of the second generation of digital meters for the Saudi Electrical Company under the authority of AEC.
GPA: 3.604
GPA: 3.515
GPA: 3.56
Journals
Conference Proceeding
Conference Demonstration
Workshop & Abstract
Newsletter
ST-Hadoop is a MapReduce framework that acknowledges the fact that space and time play a crucial role in query processing. ST-Hadoop is an open-source extension of the Hadoop framework that injects spatio-temporal awareness into the code base of four layers inside SpatialHadoop, namely the language, indexing, MapReduce, and operations layers. The spatio-temporal indexing techniques inside ST-Hadoop are primarily tuned to accommodate newly updated datasets efficiently without the need to rebuild the index. The key point behind the performance gain of ST-Hadoop is the idea of indexing, where data are temporally loaded and divided across computation nodes. For more information, please visit: http://st-hadoop.cs.umn.edu.
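The indexing idea described above, where data are first divided by time and then by space so that new data can be added without rebuilding existing partitions, can be illustrated with a small in-memory sketch. This is not ST-Hadoop code: the class name `SpatioTemporalIndex`, the daily slice granularity, and the fixed `CELL_SIZE` grid are all assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime

CELL_SIZE = 1.0  # hypothetical grid cell width, in coordinate units


def slice_key(ts: datetime) -> str:
    """Temporal slice key: one partition per calendar day (assumed granularity)."""
    return ts.strftime("%Y-%m-%d")


def cell_key(x: float, y: float) -> tuple[int, int]:
    """Spatial cell key: the uniform grid cell containing the point."""
    return (int(x // CELL_SIZE), int(y // CELL_SIZE))


class SpatioTemporalIndex:
    """Two-level index: temporal slice first, then spatial grid cell."""

    def __init__(self):
        # slice key -> cell key -> list of (x, y, timestamp, payload)
        self.slices = defaultdict(lambda: defaultdict(list))

    def insert(self, x, y, ts, payload):
        # New data only touches its own slice; other slices are left as-is.
        self.slices[slice_key(ts)][cell_key(x, y)].append((x, y, ts, payload))

    def range_query(self, x_min, y_min, x_max, y_max, t_start, t_end):
        lo, hi = slice_key(t_start), slice_key(t_end)
        cx_min, cy_min = cell_key(x_min, y_min)
        cx_max, cy_max = cell_key(x_max, y_max)
        results = []
        for key, cells in self.slices.items():
            if not (lo <= key <= hi):  # prune slices outside the time range
                continue
            for (cx, cy), records in cells.items():
                if not (cx_min <= cx <= cx_max and cy_min <= cy <= cy_max):
                    continue  # prune cells outside the spatial range
                for x, y, ts, payload in records:
                    if (x_min <= x <= x_max and y_min <= y <= y_max
                            and t_start <= ts <= t_end):
                        results.append(payload)
        return results
```

In a distributed setting each slice and cell would map to a partition on some node; here plain dictionaries stand in for that layout so the pruning logic is easy to follow.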
TAREEG is a MapReduce-based system for extracting spatial data from OpenStreetMap. Real spatial data, e.g., detailed road networks, rivers, buildings, and parks, are not readily available in most of the world. This hinders the practicality of many research ideas that need real spatial data for experiments. Such data is often available for governmental use or at major software companies, but it is prohibitively expensive for academia or individual researchers to build or buy. TAREEG is a web service that makes real spatial data, from anywhere in the world, available at the fingertips of every researcher or individual. TAREEG gets all its data by leveraging the richness of the OpenStreetMap dataset, the most comprehensive available spatial data of the world. Yet, it is still challenging to obtain OpenStreetMap data due to size limitations, the special data format, and the noisy nature of spatial data. TAREEG employs MapReduce-based techniques to make it efficient and easy to extract OpenStreetMap data in a standard form with minimal effort. TAREEG is accessible via www.tareeg.org.
MinnesotaTG is a project developed at the University of Minnesota. MinnesotaTG is built on two existing traffic generators: (1) BerlinMod and (2) Thomas-Brinkhoff. The purpose of MinnesotaTG is to take an arbitrary region in the United States and generate traffic data for that region. Without this tool, generating this traffic is a complicated and drawn-out process because of the number of configuration steps necessary to get either Thomas-Brinkhoff or BerlinMod up and running and able to work on a user-specified region. The traffic is not generated by the tool itself; rather, generation is performed by these two traffic generators. For more information, please visit: http://mntg.cs.umn.edu/.
Taghreed is a system for querying, analyzing, and visualizing geotagged tweets. Taghreed is the first to manage both recent and historical Twitter data. It digests incoming fast data in real time and scales to billions of historical data items. On both, scalable spatio-temporal keyword queries are supported to facilitate efficient spatio-temporal analysis and visualization of Twitter data. Taghreed is powering two startups that provide social media analysis services for Middle Eastern customers at the Wadi Makkah innovation incubator. For more information, please visit: http://www.gistic.org/taghreednew/.
Undergraduate Students
Grid Index in Crowdsourcing (2020-2021): Our research area is computer science, particularly spatial computing. Spatial computing is an important computing technology for geographically related applications and services. It helps in the construction of geographical websites' profiles and services. Spatial computing can be used in location identification, as in Google Maps, and has other uses in spatial planning, traffic congestion control, evacuation systems, crowd management, and spatial crowdsourcing. In this research, we investigated two main components of spatial crowdsourcing. First, we introduced spatial indexes to a crowdsourcing platform. Next, we constructed the basic spatial operations of crowdsourcing applications. We conducted an experimental study to recommend a suitable big data framework and a suitable spatial index for crowdsourcing applications.
Conference Paper.
Team:
Shahd Mosa Qari,
Manal Abdulrahman Alharthi,
Leena Solayman Houmaidan,
Raghad Mohammed Alqurashi, and
Ashwaq Ahmed Almalki.
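As an illustration of the kind of spatial index discussed in the Grid Index in Crowdsourcing project above, the following is a minimal sketch of a uniform grid index that finds workers near a task location. It is not the project's implementation: the class `GridIndex`, its methods `upsert` and `within`, and the use of planar Euclidean distance are assumptions made for this example.

```python
import math
from collections import defaultdict


class GridIndex:
    """Uniform grid over the plane; each cell holds the ids of workers inside it."""

    def __init__(self, cell_size: float):
        self.cell_size = cell_size
        self.cells = defaultdict(set)  # (col, row) -> set of worker ids
        self.positions = {}            # worker id -> (x, y)

    def _cell(self, x, y):
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def upsert(self, worker_id, x, y):
        """Insert a worker, or move an existing worker to a new location."""
        if worker_id in self.positions:
            old_cell = self._cell(*self.positions[worker_id])
            self.cells[old_cell].discard(worker_id)
        self.positions[worker_id] = (x, y)
        self.cells[self._cell(x, y)].add(worker_id)

    def within(self, x, y, radius):
        """Return the sorted ids of workers within `radius` of the point (x, y)."""
        c_min = self._cell(x - radius, y - radius)
        c_max = self._cell(x + radius, y + radius)
        found = []
        # Only cells overlapping the query's bounding box are inspected.
        for col in range(c_min[0], c_max[0] + 1):
            for row in range(c_min[1], c_max[1] + 1):
                for wid in self.cells.get((col, row), ()):
                    wx, wy = self.positions[wid]
                    if math.hypot(wx - x, wy - y) <= radius:
                        found.append(wid)
        return sorted(found)
```

A task at a given location could call `within` to shortlist nearby workers. For real latitude and longitude data, a geodesic distance would replace `math.hypot`.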
ScanMarket (2019-2020): With the technical developments that have taken place or will take place in Saudi Arabia under Vision 2030, it is vital for retailers to keep up with this development. One of the challenges retailers might face is the need for fast and easy shopping systems. In ScanMarket, we propose and develop a system that saves time and effort, helps retailers satisfy consumers, and reduces work pressure on staff and organizations.
Demo.
Poster.
Team:
Bashayer Ali Youseff,
Fatmah Ahmed Sidebaba,
BGhofran Khalid Doman, and
Kholod Hussin Al-Sehli.
Imprint (2019-2020): Imprint is a web-based system for querying and analyzing microblogs. Twitter is a form of microblogging with a stream of rich data that carries different types of information. The Imprint system has three basic operations: locating hashtags, displaying trend growth, and clustering accounts by their area of interest. This project locates topics, identifies the original location and source tweets of each topic, and displays trend evolution, which describes how a trend spreads across the country. Imprint clusters Twitter users by their interests using various text classification techniques. The Imprint system is connected to a map interface that allows for interactive visualization.
Demo.
Poster.
Team:
Atheer Alhazmi,
Elham Alasmri,
Hadeel Alraddadi,
Layan Alahmadi,
Shahd Alsayed.
CMM (2019-2020): CMM stands for Crowds in Mecca and Madina. Crowd analysis has always motivated scientists, authorities, and agencies. The primary goal of the CMM project is to generate a simulation of people's movement in two specific areas: Al-Haram in Mecca and Al-Haram in Al-Madina.
Demo.
Poster.
Team:
Amal Abdo Alabsi,
Manar Mokhtar Rabee,
Juman Adnan Alfhmi,
Yousra Abdullah Felemban, and
Weaam Wael Khayyat.