Description
About the organization:
Einstein products and platform democratize AI and transform the way Salesforce builds trusted machine learning and AI products, in days instead of months. The Einstein platform augments the Salesforce Platform with the ability to easily create, deploy, and manage Generative AI and Predictive AI applications across all clouds using the Agentforce platform. We achieve this vision by providing unified, configuration-driven, and fully orchestrated machine learning APIs, customer-facing declarative interfaces, and various microservices for the entire machine learning lifecycle, including Data, Training, Predictions/Scoring, Orchestration, Model Management, Model Storage, and Experimentation.
We already produce over a billion predictions per day and train thousands of models per day, along with tens of different large language models, serving thousands of customers. We enable customers to use leading large language models (LLMs), both internally and externally developed, so they can leverage them in their Salesforce use cases. Along with the power of Data Cloud, this platform gives customers an unparalleled advantage for quickly integrating AI into their applications and processes.
About the Team:
Join the AI Cloud Infra Engineering team and become a specialist on Salesforce's AI Platform and Agentforce! You'll get to work with the latest technology in the AI infrastructure space, and collaborate with the team and cloud to identify and solve the massive-scale infrastructure challenges planned for this year. We are a diverse team of curious minds that specialize in distributed systems, cloud-based infrastructure, continuous delivery, security research, and innovative tool development. We evaluate a broad range of technologies, including distributed processing, virtualized environments, microservices, and automated tools. Outside of work, we also focus on volunteering and live the 1:1:1 model!
We are looking for engineering leaders to help take us to the next level and build an infrastructure platform that can host and scale to hundreds of thousands of customers and hundreds of billions of predictions per day, and that works with bleeding-edge technologies in model training, model inference, and Generative AI.
Responsibilities:
* Drive the execution and delivery of features by collaborating with many cross functional teams, architects, product owners and engineers
* Make critical decisions that contribute to the success of the product
* Proactively foresee issues and resolve them before they happen
* Daily management of standups as the ScrumMaster for engineering teams
* Partner with the program team to align on objectives, priorities, tradeoffs and risks
* Ensuring the team has clear priorities and adequate resources
* Empowering the delivery team to self organize
* Be a multiplier and have a passion for team and team members’ success
* Providing technical guidance, career development, and mentoring to team members
* Maintaining high morale and motivating the delivery team to go above and beyond
* Vocally advocating for technical excellence and helping the teams make good decisions
* Participating in architecture discussions and planning
* Participating in cross-functional coordination, planning, and reviews with leads from other engineering teams
* Maintaining and fostering our culture by interviewing and hiring only the most qualified individuals
* Be passionate about automation and avoid doing things manually
* Occasionally contributing to development tasks such as scripting and feature verifications to assist teams with release commitments, to gain an understanding of the deeply technical product as well as to keep your technical acumen sharp
Required Skills:
* Master's or Bachelor's degree in Computer Science, Software Engineering, or equivalent experience
* 5+ years of experience leading software, DevOps or system engineering teams with a distinguished track record on technically demanding projects
* Strong verbal and written communication skills, organizational and time management skills
* Ability to be nimble, proactive, comfortable working with minimal specifications
* Experience in hiring, mentoring and managing engineers
* Commitment to championing a culture and work environment that promotes diversity and inclusion
* Working experience of software engineering best practices including coding standards, code reviews, CI, build processes, testing, and operations
* Experience with an AI technology stack such as SageMaker, Bedrock, or similar LLM hosting platforms
* Experience with Agile development methodologies; ScrumMaster experience required
* Experience in communicating with users, other technical teams, and product management to understand requirements, describe product features, and technical designs
* Prior experience in any of the following languages: Go, Python, Ruby, Java
* Experience working with source control, continuous integration, and testing pipelines
* Experience building large scale distributed, fault-tolerant systems
* Experience with container orchestration systems such as Kubernetes, Docker, Helios, Fleet
* Public cloud engineering on AWS (Amazon Web Services), GCP (Google Cloud Platform), or Azure platforms
* Experience in configuration management technologies such as Chef, Puppet, Ansible, Terraform
Preferred Skills:
* Master's degree in Computer Science
* Experience building large-scale search clusters using a technology like Elasticsearch
* Understanding of fundamental network technologies like DNS, Load Balancing, SSL, TCP/IP, SQL, HTTP
* Understanding of cloud security and best practices