Jet flavour tagging is crucial in experimental high-energy physics. A tagging algorithm, DeepJetTransformer, is presented, which exploits a transformer-based neural network that is substantially faster to train than state-of-the-art graph neural networks. The DeepJetTransformer algorithm uses information from particle-flow-style objects and secondary vertex reconstruction for b- and c-jet identification, supplemented by information that is not always included in tagging algorithms at the LHC, such as reconstructed K⁰_S and Λ⁰, as well as K±/π± discrimination. The model is trained as a multiclass classifier to identify all quark flavours separately and performs excellently in identifying b- and c-jets. An s-tagging efficiency of 40% can be achieved at a ud-jet background efficiency of 10%. The performance improvement achieved by including K⁰_S and Λ⁰ reconstruction and K±/π± discrimination is presented. The algorithm is applied to exclusive Z → qq̄ samples to examine the physics potential and is shown to isolate Z → ss̄ events. Assuming all non-Z → qq̄ backgrounds can be efficiently rejected, a 5σ discovery significance for Z → ss̄ can be achieved with an integrated luminosity of 60 nb⁻¹ of e⁺e⁻ collisions at √s = 91.2 GeV, corresponding to less than a second of the FCC-ee run plan at the Z boson resonance.
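The correspondence between the quoted integrated luminosity and the running time can be checked with a simple unit conversion. The sketch below is illustrative only: the instantaneous luminosity of 10^36 cm^-2 s^-1 is an assumed order-of-magnitude placeholder that is not taken from this text, and the function name time_to_collect is hypothetical.

```python
# Unit conversion: 1 nb = 1e-33 cm^2, hence 1 nb^-1 = 1e33 cm^-2.
NB_INV_TO_CM2_INV = 1e33


def time_to_collect(int_lumi_nb_inv: float, inst_lumi_cm2_s: float) -> float:
    """Seconds needed to accumulate an integrated luminosity (in nb^-1)
    at a constant instantaneous luminosity (in cm^-2 s^-1)."""
    if inst_lumi_cm2_s <= 0:
        raise ValueError("instantaneous luminosity must be positive")
    return int_lumi_nb_inv * NB_INV_TO_CM2_INV / inst_lumi_cm2_s


# ASSUMPTION: order-of-magnitude placeholder for the instantaneous luminosity
# at the Z resonance; this value is not given in the text.
ASSUMED_INST_LUMI = 1.0e36  # cm^-2 s^-1

# Integrated luminosity quoted in the text for the 5 sigma significance.
seconds = time_to_collect(60.0, ASSUMED_INST_LUMI)
print(f"Time to collect 60 nb^-1: {seconds:.3f} s")
```

Under this assumption the required data set accumulates in a small fraction of a second, in line with the "less than a second" statement; a different assumed luminosity rescales the result inversely.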