
Rimoldini, L., et al.: A&A 674, A14 (2023)
score. Selection procedures, parameter distributions, and assess-
ments of candidates are presented for each class.
Some of the classification results are further processed
by specific object studies (SOSs) dedicated to single classes,
typically describing a subset of the most reliable candidates in
detail. Such single-class processing modules are available in
DR3 for active galactic nuclei (AGNs; Carnerero et al. 2023),
Cepheids (Ripepi et al. 2023), compact companions (Gomel et al.
2023), eclipsing binaries (Mowlavi et al. 2023), long-period
variables (Lebzelter et al. 2023), main-sequence oscillators
(Gaia Collaboration 2023b), planetary transits (Panahi et al.
2022), and RR Lyrae stars (Clementini et al. 2023). Other SOS
modules, such as microlensing events (Wyrzykowski et al. 2023),
short-timescale variables (see Sect. 10.12 of the Gaia DR3
documentation; Rimoldini et al. 2022), and solar-like rotation
modulation stars (Distefano et al. 2023), were executed indepen-
dently of the classification results, as they relied on their own
candidate selection. A summary of the variability results from all
modules is presented in Eyer et al. (2023).
This article is organised as follows. The classification input
data are outlined in Sect. 2; the preparation, application, and verification
of supervised learning procedures are described in Sect. 3;
the results for each class are presented in Sect. 4; and conclusions
are drawn in Sect. 5. Special training selections applied to a sub-
set of classes are detailed in Appendix A; selected classification
attributes are listed in Appendix B; additional class labels from
the literature (among the false positive classes listed in Table 3)
are defined in Appendix C; some examples of queries to facilitate
the exploitation of classification results in the Gaia archive (footnote 2) are
provided in Appendix D; and common diagrams for all classes,
including a summary of trained and classified sources, an assess-
ment of the results with respect to the literature, and sample light
curves, are presented in Appendix E. All table names in the Gaia
archive that are mentioned in the text assume the prefix gaiadr3
(as shown in Appendix D).
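As a minimal sketch of the table-name convention just described, the snippet below prepends the gaiadr3 prefix to a bare table name and builds a simple ADQL query string. The helper names `qualified` and `top_rows_query` are hypothetical and introduced only for illustration; the snippet builds a string and does not contact the archive.

```python
# Sketch only: helper names are hypothetical, not part of any Gaia tool.
SCHEMA = "gaiadr3"  # prefix assumed for all table names, as stated in the text


def qualified(table: str) -> str:
    """Return the schema-qualified table name, e.g. 'gaiadr3.vari_summary'."""
    if table.startswith(SCHEMA + "."):
        return table  # already qualified, leave unchanged
    return f"{SCHEMA}.{table}"


def top_rows_query(table: str, n_rows: int = 10) -> str:
    """Build an ADQL query string returning the first n_rows of a table."""
    if n_rows <= 0:
        raise ValueError("n_rows must be positive")
    return f"SELECT TOP {n_rows} * FROM {qualified(table)}"


query = top_rows_query("vari_summary", 5)
print(query)
```

The resulting string could be pasted into the query form of the Gaia archive or passed to a TAP client of choice.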
2. Data
As part of the Gaia variability pipeline (Eyer et al. 2023), the general
classification module received – as input – sources with photometric
time series in the G, G_BP, and G_RP bands (Riello et al.
2021) that had at least five field-of-view (FoV) measurements
in the G band, which were already identified as potential vari-
able sources and characterised by basic statistics and periodic-
ity parameters. Before any computation, sources and associated
epoch FoV transits were processed by the chain of operators
described in Sect. 10.2.3 of the Gaia DR3 documentation
(Rimoldini et al. 2022) and Sect. 3.1 of Eyer et al. (2023), which
selected, transformed, and cleaned time series from spurious or
doubtful observations. The balance between outlier removal and
signal preservation favoured the latter, considering that some of
the targeted variability types relied on a small number of outlier-
like measurements (such as Algol-type eclipsing binaries and
microlensing events). All time series and derived statistical num-
bers hereafter refer to these cleaned time series. The median
number of FoV measurements in the three photometric bands is
between 40 and 44 per source (Eyer et al. 2023), within a time
span of typically 900–1000 days in the G band.
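The input requirement of at least five G-band FoV measurements can be illustrated with a small sketch. The source identifiers and epochs below are invented for illustration and the variable names are assumptions; only the threshold of five measurements comes from the text.

```python
from statistics import median

MIN_G_FOV = 5  # minimum number of G-band FoV measurements, from the text

# Hypothetical cleaned time series: source id -> G-band FoV epochs (days).
g_epochs = {
    "src_a": [0.0, 30.1, 61.5, 120.2, 250.9, 480.3],
    "src_b": [5.2, 90.4, 300.7],
    "src_c": [1.0, 45.0, 200.0, 410.0, 650.0, 800.0, 930.0],
}

# Keep only sources that meet the minimum number of measurements.
eligible = {sid: t for sid, t in g_epochs.items() if len(t) >= MIN_G_FOV}

# Summary statistics analogous to those quoted in the text.
median_n = median(len(t) for t in eligible.values())
time_spans = {sid: max(t) - min(t) for sid, t in eligible.items()}

print(sorted(eligible), median_n, time_spans)
```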
Footnote 2: https://gea.esac.esa.int/archive/
While the processing of Gaia (early) DR3 photometry
included significant calibration improvements with respect to
DR2 (Riello et al. 2021), some low-level uncalibrated systematic
effects remained and their impact on epoch photometry is
described in Evans et al. (2023). Among instrumental effects,
scan-angle dependent signals were induced mainly by asym-
metric extended sources (such as barred spiral galaxies and
tidally distorted stars) and multiple close pairs (≲1″) of point-like
sources (Holl et al. 2023). Although such signals helped the
identification of galaxies from photometric variations, in general
data artefacts might interfere with the correct identification of
classes with genuine variability, especially those associated with
low signal-to-noise ratios.
The classification of variables also employed astrometrically
derived parameters such as parallax and proper
motion (Lindegren et al. 2021b). However, Gaia DR3 astro-
physical parameters (Andrae et al. 2023; Creevey et al. 2023;
Delchambre et al. 2023; Fouesneau et al. 2023) could not be
included as they were processed in parallel and became avail-
able after the results of the variability pipeline were finalised.
A subset of classified sources was analysed in more detail
by subsequent SOS modules, typically focusing on specific
classes, as mentioned in Sect. 1. The results of all variability
modules were subject to additional source filtering before their
ingestion into the public Gaia archive (Babusiaux et al. 2023).
Statistical parameters of all the photometric time series pub-
lished in Gaia DR3 are available in the vari_summary table.
3. Method
For Gaia DR3, general classification relied on supervised
machine learning, that is, training classifiers with sources of
known variability types and applying the resulting models to
classify sources of an unknown variability type. Known variables
in the literature were cross-matched with Gaia sources,
verified, selected, and characterised by attributes derived from
the Gaia data. The use of both cross-matched sources and (optimised)
classification attributes for training was described in
Rimoldini et al. (2019) and is not repeated here.
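The train-then-apply pattern of supervised classification can be sketched with a deliberately simple nearest-centroid classifier. This is a stand-in chosen for brevity, not the classifier used for Gaia DR3; the two attributes, the class labels, and all numerical values are toy assumptions.

```python
from collections import defaultdict
from math import dist


def train(features, labels):
    """Compute one centroid per class from labelled training sources."""
    grouped = defaultdict(list)
    for vector, label in zip(features, labels):
        grouped[label].append(vector)
    return {
        label: tuple(sum(col) / len(vectors) for col in zip(*vectors))
        for label, vectors in grouped.items()
    }


def classify(model, vector):
    """Assign the class whose centroid is closest to the attribute vector."""
    return min(model, key=lambda label: dist(model[label], vector))


# Toy attributes: (log10 of period in days, G-band amplitude in mag).
train_x = [(-0.3, 0.8), (-0.2, 0.7), (2.4, 1.5), (2.6, 2.0)]
train_y = ["RR", "RR", "LPV", "LPV"]

model = train(train_x, train_y)          # training on known variables
print(classify(model, (-0.25, 0.6)))     # applying the model to an unknown source
print(classify(model, (2.3, 1.2)))
```

The structure is the same for any supervised method: the quality of the labels passed to `train` bounds the quality of what `classify` can return, which is why the vetting of training sources matters.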
An extensive cross-match of Gaia sources was compiled by
Gavras et al. (2023), which provided millions of variable objects
from the literature and represented over 100 variability types. The
robustness of the cross-match method, which included astromet-
ric and photometric information in the identification of matches,
and the verification of the genuineness of literature classifi-
cations ensured the reliability of training sources (critical to
supervised classification) and of the validation of the results.
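The idea of combining astrometric and photometric information when accepting a match can be sketched as follows. This is not the procedure of Gavras et al. (2023): the function names, the dictionary layout, and both tolerances are assumptions made for illustration, and effects such as proper motion between catalogue epochs are ignored.

```python
from math import asin, cos, degrees, radians, sin, sqrt


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arcsec between two positions in degrees (haversine)."""
    ra1, dec1, ra2, dec2 = map(radians, (ra1, dec1, ra2, dec2))
    h = sin((dec2 - dec1) / 2) ** 2 + cos(dec1) * cos(dec2) * sin((ra2 - ra1) / 2) ** 2
    return degrees(2 * asin(sqrt(h))) * 3600.0


def is_match(gaia, lit, max_sep_arcsec=1.0, max_dmag=0.5):
    """Accept a pair only if it agrees in both position and brightness.

    The default tolerances are illustrative assumptions, not published values.
    """
    sep = separation_arcsec(gaia["ra"], gaia["dec"], lit["ra"], lit["dec"])
    return sep <= max_sep_arcsec and abs(gaia["mag"] - lit["mag"]) <= max_dmag


gaia_src = {"ra": 10.0, "dec": -20.0, "mag": 15.2}
lit_close = {"ra": 10.0001, "dec": -20.0, "mag": 15.3}   # close and similar brightness
lit_faint = {"ra": 10.0001, "dec": -20.0, "mag": 17.0}   # close but much fainter

print(is_match(gaia_src, lit_close), is_match(gaia_src, lit_faint))
```

Requiring agreement in brightness as well as position rejects chance alignments with unrelated neighbours, which a purely positional match would accept.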
3.1. Training set
Potential training sources from the literature were vetted for each
class to ensure the correct class membership. This was repeated
for every catalogue that was deemed trustworthy for training the
class under investigation. The reliance of supervised classifica-
tion on known objects makes it vulnerable to biases from the
literature, for instance, related to their data acquisition and clas-
sification methods. Thus, in addition to class verification, the
cross-matched objects were probed in several dimensions to
identify intrinsic biases, such as limited sky coverage or apparent
magnitude range with respect to those of Gaia, in order
to prevent (or minimise) the transfer of literature selection func-
tions to the Gaia classifications.
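One way to probe a bias of the kind mentioned above, here a limited apparent magnitude range, is to compare which magnitude bins are populated by a literature sample and by Gaia sources. The function name, the bin width, and the magnitudes are hypothetical; this is a sketch of the idea, not the verification procedure applied in the paper.

```python
def coverage_fraction(train_mags, gaia_mags, bin_width=1.0):
    """Fraction of occupied Gaia magnitude bins that also hold training sources."""
    if not gaia_mags:
        raise ValueError("gaia_mags must not be empty")

    def to_bin(mag):
        # Index of the magnitude bin of width bin_width that contains mag.
        return int(mag // bin_width)

    gaia_bins = {to_bin(m) for m in gaia_mags}
    train_bins = {to_bin(m) for m in train_mags}
    return len(gaia_bins & train_bins) / len(gaia_bins)


# Toy example: the literature sample stops near G = 16 mag.
gaia_mags = [12.3, 13.8, 15.1, 16.7, 18.2, 19.9]
train_mags = [12.5, 13.1, 13.9, 15.6]

print(coverage_fraction(train_mags, gaia_mags))
```

A value well below one flags a training sample that does not span the magnitude range of the sources to be classified; the same comparison could be made over sky position bins.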
3.1.1. Published classes
Since it was difficult to know a priori the full list of classes
that could be identified in Gaia DR3, the verification of liter-
ature classifications and source selection for training purposes