Our client is the world’s largest online community, uniting students, parents and teachers in solving their academic problems and exchanging knowledge. Every month they are proud to be home to 350 million users. Besides delivering world-class educational products, they’re a technology company at heart. They want to be the best educational platform in the world.
NOTICE: ONLINE RECRUITMENT PROCESS
LOCATION: Kraków/Barcelona or remotely from Poland/Spain
BUDGET: Senior: 26 000 – 35 000 PLN gross/monthly, Lead: 35 000 – 40 000 PLN gross/monthly
The Content Quality Systems team is responsible for providing the technical capabilities of:
- Learning new knowledge representations for text and media-rich contents.
- Enhancing and automating the content management process.
- Ensuring the quality and safety of the company’s knowledge base.
In the Senior role, you will research and develop the data science capabilities of their knowledge base and state-of-the-art NLP algorithms on top of the PyData stack, HuggingFace, PyTorch, TensorFlow, SageMaker and other AWS cloud services.
In the Tech Lead role, you will act as a technical partner to the project manager. You will lead a team of world-class scientists and engineers driving the data science capabilities of their knowledge base and state-of-the-art NLP algorithms on top of the PyData stack, HuggingFace, PyTorch, TensorFlow, SageMaker and other AWS cloud services.
The ideal candidate is an enthusiast of educational technologies with a strong theoretical background and a blend of coding, machine learning, and statistics skills.
The position is based in one of the offices located in Krakow or Barcelona, or remotely from Poland or Spain.
WHAT WILL YOU DO?
- Research and implement novel techniques of language understanding and knowledge representation.
- Explore and analyze multi-dimensional and unstructured data in order to find patterns and relationships.
- Develop probabilistic models of linguistics and social interactions sitting on top of their machine learning stack.
- Work closely with the engineers in order to integrate models and algorithms with the larger system.
- Communicate results effectively and tell clear stories with the data.
- Ensure statistical and scientific rigor among the whole team.
- Work closely with the Director of AI and product teams to design new features or improve the functionalities and user experience of the platform.
- Educate the company’s employees on topics related to machine learning.
WHAT WILL YOU NEED TO BE SUCCESSFUL IN THIS ROLE?
- Master’s degree or above in STEM fields (science, technology, engineering, or mathematics) or a similarly quantitative field.
- 4+ years of experience, or a comparable industry career, with machine learning, data mining, or statistical modeling.
- Working knowledge of Python and the PyData stack or other numerical programming languages.
- Experience with analyzing massive datasets on a distributed cluster of machines (map/reduce or other parallel computing frameworks).
- PhD with publications in Statistics, Data Science, Computer Science, Machine Learning, Mathematics or similar.
- Experience with NLP or other deep learning models in production.
- Experience with large volume ETL jobs or data streaming.
- Experience with graph data and algorithms.
- Contributor or owner of GitHub repositories.
- Ability to prototype and test suboptimal solutions quickly and iterate up to a final product that can be deployed in production.
- A scientific mentality with the ability to ask the right questions, as well as answer them.
- Ability to convey complex analyses with the most efficient and intuitive visual interactions and data storytelling.
- Ability to break research down into clearly defined tasks and quick iterations.
- Team player attitude and clear communication skills.
- Familiarity with agile development and lean principles.
Skills and systems
- Strong theoretical background in at least a few of the following: high-dimensional classifiers, regression models, clustering algorithms, recommender systems, time-series analysis, Bayesian inference, text analytics, knowledge graphs, representation learning (embeddings), computer vision, or social network analysis.
- At least some of the data analysis and visualization tools such as pandas, dask, vaex, matplotlib, seaborn, plotly, dash, bokeh, shap, streamlit.
- At least some of the data engineering technologies such as Spark, Databricks, Glue, EMR, Docker, Kubernetes, SQL, key-value stores, Redshift, Snowflake.
- At least some of the ML technologies such as AWS SageMaker, TensorFlow Extended, PyTorch, Spark ML, scikit-learn, XGBoost, Kubeflow, MLflow, or related frameworks.
- Fluent in English.
- Start date: as soon as possible (however, they’re happy to wait for the right person)
- Some of their benefits (the final package will depend on the location):
- Flexible working hours and the possibility to work remotely
- Personal development budget of $800 per year, an unlimited time-off policy for participation in conferences and workshops, and access to an online learning platform with courses from Udemy, Harvard Manage Mentor and many others
- Fully paid private health care packages for you and your family (dental care included) provided by Luxmed
- Fully paid life insurance provided by Warta
- Multisport Plus card
- Access to the Mental Health Helpline – providing virtual support of external psychologists, psychotherapists, and coaches
- AskHenry services – personal concierge services to help you settle everyday matters (like Ikea shopping or a visit to the shoemaker)
- Possibility to join one of their Employee Resource Groups and initiatives
- If needed, an additional budget for remote work accessories
Note: Prepare your CV in English (PDF), fill in the form, and apply!
Please include in your CV the following clause necessary for the recruitment process:
“I agree to the processing of personal data that I have made available voluntarily in the recruitment process by the Administrator of personal data, i.e. Dotcommunity Spółka z ograniczoną odpowiedzialnością [Ltd.] based in Cracow, 15 Żabiniec Street, 31-215 Cracow, registered in Poland, the Cracow’s District Court – Śródmieście, XI Commercial Division of the National Court Register under number 0000468484, VAT number: 9452174499, (“Dotcommunity”) in order to carry out the recruitment process for the Senior/Lead Data Scientist (with NLP) on the basis of Art.6 item 1a of the Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation)”