Options

This page details all the available options. This information is also available on the command line by invoking goquartical or goquartical help. The following is broken up by configuration section.

Note

These options can be specified either via command line or .yaml file e.g.:

input_ms.path=path/to.ms

or

input_ms:
    path: path/to.ms
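The equivalence between the two forms above can be sketched in a few lines of Python. This is only an illustration of how a dotted `section.option=value` string maps onto the nested layout of the .yaml file; the helper name `dotted_to_nested` is hypothetical and this is not QuartiCal's own parser.

```python
def dotted_to_nested(assignments):
    """Convert 'section.option=value' strings into a nested dict.

    Hypothetical helper for illustration only; values are kept as strings.
    """
    config = {}
    for item in assignments:
        key, _, value = item.partition("=")
        section, _, option = key.partition(".")
        config.setdefault(section, {})[option] = value
    return config


nested = dotted_to_nested(["input_ms.path=path/to.ms", "input_ms.data_column=DATA"])
print(nested)  # {'input_ms': {'path': 'path/to.ms', 'data_column': 'DATA'}}
```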

input_ms

Options pertaining to the input measurement set.

input_ms.path

Path to input measurement set.

Default: N/A

input_ms.data_column

Name of column to use as data.

Default: DATA

input_ms.sigma_column

When given, the weights will be reinitialized as 1/sigma_column**2. This should be preferred over weight_column to ensure that the correct data weights are used and not those from a previous calibration run. Mutually exclusive with input_ms.weight_column.

Default: N/A
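As a small worked example of the 1/sigma_column**2 relation, the sketch below computes weights from a handful of sigma values. The function name and the plain-list representation are assumptions made for illustration; they say nothing about how QuartiCal stores or processes the column.

```python
def weights_from_sigma(sigma_values):
    """Return 1/sigma**2 for each sigma value (illustrative only)."""
    return [1.0 / (sigma * sigma) for sigma in sigma_values]


weights = weights_from_sigma([0.5, 1.0, 2.0])
print(weights)  # [4.0, 1.0, 0.25]
```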

input_ms.weight_column

Column to read weights from. An empty string will result in all weights being treated as unity if sigma_column is not set. Mutually exclusive with input_ms.sigma_column (which is preferred).

Default: N/A

input_ms.time_chunk

Chunk data up by this number of timeslots. This limits the amount of data processed at once. Smaller chunks allow for a smaller RAM footprint and greater parallelism, but this sets an upper limit on the solution intervals that may be employed. Specify as an integer number of timeslots, or a value with a unit (e.g. ‘300s’). 0 means use full time axis.

Default: 0

input_ms.freq_chunk

Chunk data by this number of channels. See time_chunk for info. Specify as an integer number of channels, or a value with a unit (e.g. ‘128MHz’). 0 means use full frequency axis.

Default: 0
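The chunking behaviour described for time_chunk and freq_chunk can be pictured with the following sketch, which covers only the integer form (values with units such as ‘300s’ or ‘128MHz’ are not handled). The function `chunk_sizes` is a hypothetical helper, and the treatment of a trailing partial chunk is an assumption.

```python
def chunk_sizes(axis_length, chunk):
    """Split an axis of axis_length samples into chunks of at most `chunk`.

    A chunk value of 0 means the full axis is used as a single chunk.
    Assumption: any remainder forms a final, smaller chunk.
    """
    if chunk == 0:
        return [axis_length]
    full, remainder = divmod(axis_length, chunk)
    return [chunk] * full + ([remainder] if remainder else [])


print(chunk_sizes(100, 30))  # [30, 30, 30, 10]
print(chunk_sizes(100, 0))   # [100]
```

Because a solution cannot span more than one chunk, the largest entry in this list is also the largest solution interval that could be used along that axis.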

input_ms.is_bda

If set True, the input measurement set is assumed to have been averaged in a baseline dependent fashion.

Default: False

input_ms.group_by

Input data will be partitioned into separate xarray datasets based on the values of the specified columns. Multiple column names may be given as a list, e.g. [SCAN_NUMBER, FIELD_ID, DATA_DESC_ID].

Choices:

SCAN_NUMBER

FIELD_ID

DATA_DESC_ID

Default: [‘SCAN_NUMBER’, ‘FIELD_ID’, ‘DATA_DESC_ID’]

input_ms.select_corr

Select correlations from the input data. These are specified as integer values and must be given as a list e.g. to select the first and last correlations in a measurement set with four correlations, use [0, 3].

Choices:

0

1

2

3

Default: N/A

input_ms.select_fields

Select fields from the input data. These are specified as integer values and must be given as a list e.g. to select fields 2 and 6 use [2, 6].

Default: []

input_ms.select_ddids

Select data descriptor IDs (spectral windows) from the input data. These are specified as integer values and must be given as a list e.g. to select ddids/SPWs 0 and 2, use [0, 2].

Default: []

input_ms.select_uv_range

Select a range of uv values to include when calibrating. Practically, this treats points outside the range as having zero weight. Use zero to indicate an open interval, e.g. [100, 0] would select all values greater than 100, [0, 10000] would select all values less than 10000 and [100, 10000] would select values greater than 100 but less than 10000.

Default: [0, 0]
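The interval convention above, with zero marking an open end, can be sketched as a simple mask. The helper `uv_selection_mask` is hypothetical, and the strict inequalities follow the wording "greater than" and "less than"; whether the boundaries themselves are included is an assumption.

```python
def uv_selection_mask(uv_distances, uv_range):
    """Return True for points inside the selected uv range.

    A zero bound is treated as open (no limit on that side). Points mapped
    to False would be the ones given zero weight.
    """
    lower, upper = uv_range
    mask = []
    for distance in uv_distances:
        above_lower = lower == 0 or distance > lower
        below_upper = upper == 0 or distance < upper
        mask.append(above_lower and below_upper)
    return mask


distances = [50, 500, 20000]
print(uv_selection_mask(distances, [100, 0]))      # [False, True, True]
print(uv_selection_mask(distances, [0, 10000]))    # [True, True, False]
print(uv_selection_mask(distances, [100, 10000]))  # [False, True, False]
```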

input_model

Options pertaining to the input model.

input_model.recipe

Input model recipe. This is a string which describes how the model data should be constructed. Currently, measurement set columns and Tigger lsms are supported. Multiple columns and lsms can be specified and combined using colons (:), addition (+) and negation (~). As an example consider COL1, COL2 and LSM1. “COL1” will simply use a single MS column as input. “COL1:COL2” will create a direction dependent model with COL1 as the first direction and COL2 as the second. “COL1~COL2” will create a model with a single direction by subtracting COL2 from COL1. “COL1+COL2” will create a model with a single direction by adding COL1 and COL2. Tigger lsms can be used interchangeably with columns (using the same syntax) but also support the use of tagged directions. “LSM1” will create a model with a single direction from the sky model. “LSM1@dE” will create a model with multiple directions, one for each cluster tagged with dE in the lsm. Combining all of the above, it is possible to do things like “COL1~LSM1:LSM1@dE”, which will create a model with multiple directions, the first given by COL1 minus LSM1 and the remainder given by the dE tagged clusters in LSM1. Leaving this value unset (the default) will use an identity model.

Default: N/A
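To make the recipe syntax concrete, the sketch below splits a recipe string into directions and signed terms. It is a simplified reading of the rules described above, not QuartiCal's parser: `parse_recipe` is a hypothetical name, the sketch assumes names do not themselves contain `:`, `+`, `~` or `@`, and it does not expand a tagged lsm into one direction per cluster.

```python
import re


def parse_recipe(recipe):
    """Split a recipe into directions, each a list of (sign, name, tag).

    sign is +1 for added terms and -1 for terms preceded by '~'.
    tag is the text after '@' (e.g. 'dE'), or None when absent.
    """
    directions = []
    for direction in recipe.split(":"):
        terms = []
        # Each match is an optional operator followed by a name.
        for operator, name in re.findall(r"([+~]?)([^+~]+)", direction):
            base, _, tag = name.partition("@")
            sign = -1 if operator == "~" else 1
            terms.append((sign, base, tag or None))
        directions.append(terms)
    return directions


parsed = parse_recipe("COL1~LSM1:LSM1@dE")
print(parsed)
# [[(1, 'COL1', None), (-1, 'LSM1', None)], [(1, 'LSM1', 'dE')]]
```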

input_model.beam

Path to beams. If specified, beams are applied during predict, e.g. ‘beam_$(corr)_$(reim).fits’ or ‘beam_$(CORR)_$(REIM).fits’.

Default: N/A

input_model.beam_l_axis

Determines the orientation of the beam l-axis. Note that ~ indicates flipping the orientation of the axis.

Choices:

X

~X

Y

~Y

L

~L

M

~M

Default: X

input_model.beam_m_axis

Determines the orientation of the beam m-axis. See beam_l_axis.

Choices:

X

~X

Y

~Y

L

~L

M

~M

Default: Y

input_model.invert_uvw

The UVW coordinates will be negated if this option is specified. Enabled by default to match the Casa convention.

Default: True

input_model.source_chunks

The number of sources to predict simultaneously. Has a large impact on memory footprint.

Default: 500

input_model.apply_p_jones

Determines whether P-Jones (parallactic angle rotation) is applied to the model. This affects both measurement set columns and predicted components. Care must be taken when using this option together with output.apply_p_jones_inv.

Default: False

output

Options pertaining to output.

output.gain_directory

Name of directory in which QuartiCal gain outputs will be stored. Accepts both local and s3 paths. QuartiCal will always produce gain outputs.

Default: gains.qc

output.log_directory

Name of directory in which QuartiCal logging outputs will be stored. s3 is not currently supported for these outputs.

Default: logs.qc

output.log_to_terminal

Enable or disable logging to terminal.

Default: True

output.overwrite

Whether or not the contents of the output directory may be overwritten. Will trigger an error when False and the output directory already exists.

Default: False

output.products

The desired output data products. Multiple data products can be specified as a list e.g. [residual, corrected_residual]. Any required measurement set outputs should be specified here. Note that the output names of the desired products should be specified via output.columns.

Choices:

corrected_data

corrected_residual

residual

weight

corrected_weight

model_data

Default: N/A

output.columns

Output MS column names for visibility outputs (if applicable). Column names will be used in order, matching the order of output.products. Multiple columns can be specified as a list e.g. [COL1, COL2, COL3].

Default: N/A
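The order-based pairing between output.products and output.columns amounts to a positional match, sketched below. The helper name is hypothetical, the column names are placeholders, and raising an error on a length mismatch is an assumption rather than documented behaviour.

```python
def pair_products_with_columns(products, columns):
    """Match each product to the column at the same position."""
    if len(products) != len(columns):
        # Assumption: one column name is expected per requested product.
        raise ValueError("products and columns must have the same length")
    return dict(zip(products, columns))


pairs = pair_products_with_columns(["residual", "corrected_residual"], ["COL1", "COL2"])
print(pairs)  # {'residual': 'COL1', 'corrected_residual': 'COL2'}
```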

output.flags

If True, write out flags to FLAG and FLAG_ROW. This can be disabled if you want QuartiCal to leave the measurement set flags unaltered.

Default: True

output.apply_p_jones_inv

Determines whether the inverse of P-Jones (parallactic angle rotation) is applied to the output visibilities. This has the effect of derotating the output visibilities into the sky frame. Care must be taken when using this option together with input_model.apply_p_jones.

Default: False

output.subtract_directions

Which model directions to subtract when generating residuals. Must be specified as a list of integers e.g. [0, 5, 7]. The default will subtract all directions.

Default: N/A

output.net_gains

Merge subsets of gains into an effective/net gain per antenna per time per channel. This is formed by multiplying all the specified gains together. This can be used to reduce the computational load of solution transfer by transferring net/effective gains rather than each individual term. Accepts a list or a list of lists e.g. [“K”, “G”, “X”, “B”] or [[“K”, “G”], [“X”, “B”]]. Results will be written to output.gain_directory as e.g. KGXB-net.

Default: N/A
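The idea of a net gain can be illustrated with scalar gains for a single antenna, time and channel. This is a deliberately simplified sketch: real gains may be 2x2 Jones matrices, for which the multiplication order matters, and the `-net` labels for the list-of-lists case are extrapolated from the KGXB-net example rather than taken from QuartiCal. All names and values here are hypothetical.

```python
def net_gains(gains, spec):
    """Multiply the listed scalar gains together for each group in spec.

    gains: dict mapping a term name to a complex gain value.
    spec:  a list of term names, or a list of such lists.
    """
    groups = spec if isinstance(spec[0], list) else [spec]
    result = {}
    for group in groups:
        product = 1 + 0j
        for term in group:
            product *= gains[term]
        result["".join(group) + "-net"] = product
    return result


example_gains = {"K": 1j, "G": 2 + 0j, "X": 0.5 + 0j, "B": -1j}
single = net_gains(example_gains, ["K", "G", "X", "B"])
grouped = net_gains(example_gains, [["K", "G"], ["X", "B"]])
print(single)
print(grouped)
```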

output.compute_baseline_corrections

Enable or disable computation of baseline-based corrections. Functionality is currently limited to a solution per-channel, per-chunk. These solutions are useful for analysis and are stored in output.gain_directory.

Default: False

output.apply_baseline_corrections

Enable or disable application of baseline-based corrections. Extreme caution advised - this can and will lead to overfitting.

Default: False

mad_flags

Options pertaining to MAD (Median Absolute Deviation) flagging.

mad_flags.enable

Enables the MAD flagging routines.

Default: False

mad_flags.whitening

Determines whether and how the residuals are whitened (multiplied by the square root of the weights) prior to performing MAD flagging. “native” whitening will use the original weights (specified in the input_ms section). “robust” will use the weights produced when solver.robust=True. “disabled” will result in the MAD estimate being performed on the unwhitened residuals.

Default: disabled

mad_flags.threshold_bl

Multiplicative factor which determines whether or not a chi-squared value is considered to deviate significantly from the median of a given baseline. Values greater than MAD_bl*threshold_bl will be flagged. Set to zero to disable flagging on this statistic.

Default: 5

mad_flags.threshold_global

Multiplicative factor which determines whether or not a chi-squared value is considered to deviate significantly from the median of a given data chunk. Values greater than MAD*threshold_global will be flagged. Set to zero to disable flagging on this statistic.

Default: 10

mad_flags.max_deviation

Multiplicative factor which determines whether or not the MAD estimate on a given baseline is considered to deviate too much from the global MAD estimate. If the MAD estimate over all baselines is X, and the MAD estimate on a specific baseline is X_bl, baselines for which X_bl > max_deviation*X will be flagged. Set to zero to disable flagging on this statistic.

Default: 0
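The three thresholds above all compare a statistic against a multiple of a MAD estimate. The sketch below shows one plausible reading, in which a value is flagged when its absolute deviation from the median exceeds threshold times the MAD. That interpretation, the function names, and the absence of any normalisation constant (such as 1.4826) are assumptions; QuartiCal's actual routine may differ.

```python
from statistics import median


def mad(values):
    """Median absolute deviation from the median (unscaled)."""
    centre = median(values)
    return median(abs(value - centre) for value in values)


def flag_outliers(values, threshold):
    """Flag values deviating from the median by more than threshold * MAD.

    A threshold of zero disables flagging on this statistic.
    """
    if threshold == 0:
        return [False] * len(values)
    centre = median(values)
    limit = threshold * mad(values)
    return [abs(value - centre) > limit for value in values]


def baselines_to_flag(mad_per_baseline, global_mad, max_deviation):
    """Return baselines whose MAD exceeds max_deviation * global MAD."""
    if max_deviation == 0:
        return []
    return [bl for bl, bl_mad in mad_per_baseline.items()
            if bl_mad > max_deviation * global_mad]


chi_squared = [10, 11, 9, 10, 90]
print(flag_outliers(chi_squared, 5))  # [False, False, False, False, True]
print(baselines_to_flag({(0, 1): 1.0, (0, 2): 12.0}, 1.0, 10))  # [(0, 2)]
```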

mad_flags.use_off_diagonals

Controls whether or not the MAD flagger will be run on the off-diagonal elements of the residual. This is disabled by default as the residual will tend to contain structure in the absence of a polarised model and adequate leakage calibration.

Default: False

dask

Options pertaining to Dask (and therefore parallelism).

dask.threads

Number of threads to use in the dask scheduler. Setting to zero (the default) will use all available resources.

Default: N/A

dask.workers

Number of workers to use in the dask distributed scheduler. Advanced users only.

Default: 1

dask.address

Distributed scheduler address.

Default: N/A

dask.scheduler

Which dask scheduler to use. The default, threads, is the most appropriate for non-cluster use.

Choices:

threads

single-threaded

distributed

Default: threads

dask.scheduler_plugin

Enable or disable the dask scheduler plugin.

Default: True

solver

Options pertaining to all solvers (as opposed to specific terms).

solver.terms

Gain terms for which we are solving. Multiple terms can be specified as a list e.g. [G,B,dE]. Each term specified here has its own set of arguments which can be specified as (gain).(option). e.g. G.time_interval.

Default: [‘G’]

solver.iter_recipe

Specifies the iterations to be performed per gain term. This argument expects a list as long or longer than solver.terms. If solver.terms was given as [K,G,B], an iteration recipe of [20,10,5] would do 20 iterations for K, 10 for G and 5 for B. To loop over the gains multiple times, use a longer list e.g. [20,10,5,15,5,0] would do the same as the first example but then do an additional 15 iterations for K, 5 for G and skip B. Setting to zero effectively disables solving for that term and can be used in conjunction with interpolation.

Default: [25]
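The way an iteration recipe is matched against the terms can be sketched by cycling through solver.terms while walking along the recipe. The helper `expand_iter_recipe` is hypothetical and only reproduces the pairing described above.

```python
from itertools import cycle


def expand_iter_recipe(terms, iter_recipe):
    """Pair each recipe entry with a term, looping over the terms in order."""
    return list(zip(cycle(terms), iter_recipe))


schedule = expand_iter_recipe(["K", "G", "B"], [20, 10, 5, 15, 5, 0])
print(schedule)
# [('K', 20), ('G', 10), ('B', 5), ('K', 15), ('G', 5), ('B', 0)]
```

In the second pass the entry for B is 0, so B is skipped on that traversal of the chain.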

solver.propagate_flags

Controls whether gain flags/flags raised inside the solver are propagated and ultimately written to the FLAG column. This should almost always be enabled so that data associated with diverging gains is properly flagged.

Default: True

solver.robust

Enable robust reweighting in solvers. Note that this only works when the solver.iter_recipe loops through the chain multiple times. The reweighting step only happens once per traversal of the chain.

Default: False

solver.threads

Number of Numba threads per Dask thread (enables nested parallelism) to be used when running the solvers. The total number of threads used will be dask.threads*solver.threads; if this product exceeds the number of available threads, performance will suffer.

Default: 1
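A quick way to check a configuration against this rule is to compare the product dask.threads*solver.threads with the number of available threads. The sketch below uses `os.cpu_count()` as a stand-in for "available threads", which is an assumption; the function name is hypothetical.

```python
import os


def is_oversubscribed(dask_threads, solver_threads, available=None):
    """Return True if dask_threads * solver_threads exceeds available threads."""
    if available is None:
        # Assumption: the logical CPU count approximates available threads.
        available = os.cpu_count() or 1
    return dask_threads * solver_threads > available


print(is_oversubscribed(4, 4, available=8))  # True: 16 threads on 8
print(is_oversubscribed(4, 2, available=8))  # False: 8 threads on 8
```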

solver.convergence_fraction

The fraction of gain values which must converge before a solver will exit early.

Default: 0.99

solver.convergence_criteria

The change in the value of the gain below which it is considered to have converged. Set to zero to iterate for the number of iterations specified in solver.iter_recipe.

Default: 1e-06
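The interaction between solver.convergence_criteria and solver.convergence_fraction can be sketched as follows. The sketch assumes that "change" means the absolute difference between successive gain values; QuartiCal may measure the change differently (for example as a relative quantity), and the function name is hypothetical.

```python
def may_exit_early(previous, current, criteria, fraction):
    """Return True once enough gain values have changed by less than criteria.

    A criteria of zero disables early exit, so all iterations are performed.
    """
    if criteria == 0 or not current:
        return False
    converged = sum(abs(new - old) < criteria
                    for old, new in zip(previous, current))
    return converged / len(current) >= fraction


old_gains = [1.0, 1.0, 1.0, 1.0]
new_gains = [1.0 + 1e-8, 1.0, 1.0, 2.0]
print(may_exit_early(old_gains, new_gains, 1e-6, 0.75))  # True: 3 of 4 converged
print(may_exit_early(old_gains, new_gains, 1e-6, 0.99))  # False
```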

solver.reference_antenna

A reference antenna to use for terms which require one. QuartiCal will also guarantee zero phase on the specified antenna for diagonal terms. Specify as the integer index of the antenna - antenna names are not currently supported.

Default: 0

gain

Options pertaining to a specific gain/Jones term.

Warning

This help is generic - users will not typically write gain.option but will instead use the labels specified by solver.terms. Thus, for solver.terms="[G,B]", options would be specified using G.option or B.option.

gain.type

Type of gain to solve for.

Choices:

complex

diag_complex

amplitude

delay

delay_and_offset

phase

tec_and_offset

rotation_measure

rotation

crosshand_phase

leakage

Default: complex

gain.solve_per

Determines whether this term should be solved per antenna (conventional) or over the entire array (doesn’t vary with antenna).

Choices:

antenna

array

Default: antenna

gain.direction_dependent

Determines whether this term is treated as direction dependent.

Default: False

gain.pinned_directions

If this term is direction dependent, this can be used to exclude integer indexed directions during gain updates i.e. effectively disable updates in the given directions. Accepts a list of integer values e.g. ‘[0,3]’.

Default: [0]

gain.time_interval

Number of timeslots/amount of time to include in a single solution. Specify as an integer number of timeslots, or a value with a unit (e.g. ‘300s’). 0 means use full time axis.

Default: 1

gain.freq_interval

Number of channels/bandwidth to include in a single solution. Specify as an integer number of channels, or a value with a unit (e.g. ‘128MHz’). 0 means use full frequency axis.

Default: 1

gain.load_from

Load solutions from given database.

Default: N/A

gain.interp_mode

Set interpolation mode. THIS OPTION IS CURRENTLY UNAVAILABLE.

Choices:

reim

ampphase

amp

phase

Default: reim

gain.interp_method

Set interpolation method.

Choices:

2dlinear

2dspline

Default: 2dlinear

gain.respect_scan_boundaries

Determines whether solution intervals may span multiple scans. This only works when input_ms.group_by does not include SCAN_NUMBER. Can be used in conjunction with time_interval to solve a term per scan even when data is not partitioned by scan (by setting this to True and time_interval to 0).

Default: True
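The interplay between time_interval and respect_scan_boundaries can be pictured by assigning each timeslot to a solution interval. The sketch below handles only integer time_interval values and is an illustration of the described behaviour under that assumption, not QuartiCal's implementation; the function name is hypothetical.

```python
def solution_interval_ids(scan_numbers, time_interval, respect_scan_boundaries=True):
    """Assign a solution interval index to each timeslot.

    scan_numbers: scan number of each timeslot, in time order.
    time_interval: timeslots per solution; 0 means no limit.
    """
    ids = []
    current_id = -1
    slots_in_interval = 0
    previous_scan = None
    for scan in scan_numbers:
        new_scan = respect_scan_boundaries and scan != previous_scan
        interval_full = time_interval != 0 and slots_in_interval >= time_interval
        if current_id == -1 or new_scan or interval_full:
            # Start a new solution interval.
            current_id += 1
            slots_in_interval = 0
        ids.append(current_id)
        slots_in_interval += 1
        previous_scan = scan
    return ids


scans = [1, 1, 1, 2, 2]
print(solution_interval_ids(scans, 0, True))   # [0, 0, 0, 1, 1] (one per scan)
print(solution_interval_ids(scans, 0, False))  # [0, 0, 0, 0, 0]
print(solution_interval_ids(scans, 2, True))   # [0, 0, 1, 2, 2]
print(solution_interval_ids(scans, 2, False))  # [0, 0, 1, 1, 2]
```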

gain.initial_estimate

Controls whether or not a term will be populated with an initial estimate where applicable. Currently only supported for delay terms.

Default: False