What You'll Do / Responsibilities
Tech stack
ML: PyTorch, Python
DB/messaging: Neo4j, MongoDB, InfluxDB, PostgreSQL, Weaviate, RabbitMQ
Docker/Kubernetes
Main areas of development:
- Classify posts and accounts from the 12 social media/fintech platforms we cover to capture nuanced sentiment (e.g. bullish bots or long-short FOMO) toward more than 20k assets (stocks, ETFs, crypto)
- Classify accounts by their professionalism and their stance toward certain market strategies
Required Qualifications
Gradient boosting (XGBoost, LightGBM, CatBoost) on text/image features to classify retail investor profiles and posts
NLP models:
- Hugging Face Transformers (currently mT5, T5, XLM-RoBERTa; for embedding representations in Weaviate we use sentence-based Transformer models)
NLP preprocessing:
- Ability to implement embedding pipelines quickly
- Experience with data-labeling frameworks such as Rubrix is a great plus
- Keyword extraction with KeyBERT, YAKE, multi-rake, summa; requires understanding of both TF-IDF and deep models
Active participation in company life (e.g. new ideas, pet projects, opensource) is much appreciated
Understanding of Chinese social media platforms, especially WeChat, Weibo, and Zhihu
Preferred Qualifications
Highly appreciated but not crucial:
- GANs
- Adversarial attacks (poison and evasion on text and graphs)
- TorchServe
Benefits / Compensation
Competitive salary and performance-oriented incentives
Flexible and creative working environment
40-hour work week