C Crew FAQ

What does the C crew do?

  • Handle alerts
  • Handle ticket issues
  • Follow up on tickets or issues assigned to other members of the team (the C crew owns all tickets and issues)
  • Handle customer messages

How to handle context/handoff?

  • Write a follow-up note (as an on-call note or in the internal C crew channel)
  • Tag notes with the specific customer/cluster (to make them searchable)
  • Log changes
  • Log any out-of-the-ordinary issues
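To illustrate why tagging makes notes searchable, here is a minimal sketch. It assumes notes are kept as simple records with a set of tags; the `HandoffNote` class, the `find_notes` helper, the `customer:`/`cluster:` tag format, and the sample values are all hypothetical and not part of any existing tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class HandoffNote:
    """One follow-up note; every field name here is illustrative."""
    author: str
    text: str
    # Hypothetical tag format, e.g. {"customer:acme", "cluster:eu-1"}
    tags: set[str] = field(default_factory=set)
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def find_notes(notes, *tags):
    """Return the notes that carry all of the given tags, newest first."""
    wanted = set(tags)
    return sorted(
        (n for n in notes if wanted <= n.tags),
        key=lambda n: n.created_at,
        reverse=True,
    )


# Made-up sample data for illustration only.
notes = [
    HandoffNote("alice", "Restarted ingest after disk alert", {"customer:acme", "cluster:eu-1"}),
    HandoffNote("bob", "Manual config change, not yet in git", {"customer:acme", "cluster:us-2"}),
    HandoffNote("alice", "Customer asked about retention", {"customer:globex"}),
]

acme_eu = find_notes(notes, "customer:acme", "cluster:eu-1")
```

With consistent tags, the next person on shift can pull every note for one customer or cluster without reading the whole channel history.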

What can be escalated?

  • Issues affecting an important customer
  • Broad or high-impact issues
  • Issues that require specific context

What to do as the primary C crew?

  • Actively check tickets and issues
  • Follow up on tickets and issues assigned to other team members
  • Assign tickets to the secondary (being the primary does not mean handling all or most of the tickets)
  • Handle maintenance if needed

What to do as the secondary C crew?

  • Help the primary with issues (being the secondary does not mean handling all or most of the tickets)
  • Be available in case the primary needs help, and offer help proactively
  • Help with maintenance

How to communicate with other teams?

Use the team’s contact point or contact channel:

  • gherghi-bugs for Gherghi
  • monitoring-oncall and logging-oncall for Maratus
  • mehdirhms or support_platform for support platform issues

How to measure?

  • Issue count?
  • Escalation count
  • List of unsolved/unknown bugs?

How to improve?

  • Reduce escalations
  • Reduce issues
  • Reduce alerts
  • Reduce time to ack
  • Reduce time to resolution
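Time to ack and time to resolution can only be reduced if they are measured consistently. The following is a rough sketch, assuming each ticket is exported as a record with `opened`, `acked`, and `resolved` timestamps; those field names, the `mean_minutes` helper, and the sample data are hypothetical and would need to be mapped to whatever the ticketing system actually provides.

```python
from datetime import datetime, timedelta
from statistics import mean


def mean_minutes(tickets, start_key, end_key):
    """Mean minutes between two timestamps, skipping tickets missing either one."""
    spans = [
        (t[end_key] - t[start_key]).total_seconds() / 60
        for t in tickets
        if t.get(start_key) and t.get(end_key)
    ]
    return mean(spans) if spans else None


# Made-up sample data: the second ticket is acked but still unresolved.
opened = datetime(2024, 1, 1, 9, 0)
tickets = [
    {"opened": opened,
     "acked": opened + timedelta(minutes=5),
     "resolved": opened + timedelta(minutes=65)},
    {"opened": opened,
     "acked": opened + timedelta(minutes=15),
     "resolved": None},
]

time_to_ack = mean_minutes(tickets, "opened", "acked")
time_to_resolution = mean_minutes(tickets, "opened", "resolved")
```

Unresolved tickets are skipped here rather than counted as zero, so a backlog of open tickets does not make the resolution time look better than it is; the open-ticket count should be tracked separately.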

What to do in spare time?

  • Reduce alerts and fix their root causes
  • Try to “empty” Alertmanager of all current alerts
  • Add appropriate alerts, dashboards or metrics (increase observability)
  • Write automated tools or playbooks
  • Reduce custom context across customers (if possible)
  • Reproduce unsolved bugs
  • Reduce “hacky” fixes
  • Find a metric or some sort of observation for current bugs
  • Identify and log parts of the operation that aren’t mainstream yet
  • Check and resolve issues in Jira with the label crew/customer

How to handle issues?

  • Mostly with existing automated tools or playbooks
  • With documents or handbooks, for procedures not yet automated
  • By consulting previous notes

What should the C crew know?

  • Basic “how to use” knowledge of most components in the company
  • Deeper knowledge of our components
  • Different component ownerships (both inside the team and inter-team)
  • Different teams’ contact points
  • Different customers’ account managers
  • Any changes in our components

Where to check / What issues to keep track of?