Designation – Big Data Analyst
Location – Mumbai
About employer – Machinepulse
Responsibilities
- Own the entire data analysis preparation stage: model design, feature planning, system infrastructure, production setup and monitoring, and release management.
- Implement complete batch analytics for time series data using Hadoop ecosystem tools.
- Perform ETL on large-scale data sets stored in non-relational databases or distributed file systems using Map/Reduce.
- Perform large-scale aggregation of time series data at hourly, daily, weekly, monthly, quarterly and yearly intervals.
- Prepare data sets according to the requirements defined by the machine learning team to derive actionable insights.
- Implement data marts for different business needs on distributed file systems.
- Develop scripts and User Defined Functions (UDFs) in Hive/Pig/Scalding to aggregate data where required.
- Create the analytics database as part of the data processing on the Distributed File System.
- Implement the big data lambda architecture to merge batch and real-time results, and render them in the dashboard for visualization and persistence.
- Evaluate big data open source frameworks as required by developing Proofs of Concept (PoCs) and Proofs of Value (PoVs).
- Test the developed scripts on distributed and non-distributed environments in the cloud.
Qualification and Skills Required
- BTech/BE in Computer Science or a related field; MCA will also be considered.
- Familiarity with distributed systems and methodologies: Hadoop, Map/Reduce, Hive, Pig, Scalding.
- Experience with at least one NoSQL database: MongoDB, HBase, or Cassandra.
- Expertise in at least one programming language: Java, Scala, Python.
- Familiarity with Java build tools: Maven, Ant.
- Familiarity with any versioning tools: Bitbucket, GitLab, SVN.
- Good understanding of UNIX/Linux platforms.
- 2-3 years of work experience.
- Experience with any cloud environments: AWS, Rackspace, CtrlS.
- Experience with distributed system development, deployment and maintenance.
- Experience with at least one business intelligence tool: Tableau, Pentaho, QlikView.
- Must have a strong inclination towards mathematics and statistics.
Interested candidates can mail their CV to [email protected] with the subject: Big Data Analyst – Machinepulse – Mumbai