filename – ‘directory/bla/OMFITsave.txt’ or ‘directory/bla.zip’ where the OMFITtree will be saved
(if '' it will be saved in the same folder as the parent OMFITtree)
only – list of strings used to load only some of the branches from the tree (e.g. ["['MainSettings']", "['myModule']['SCRIPTS']"])
modifyOriginal – by default OMFIT saves a copy and then overwrites the previous save only if successful.
If modifyOriginal=True and filename is not a .zip, data is written directly to the destination,
which is faster but comes with the risk of losing a good save if the new save
fails for some reason
readOnly – will place entry in OMFITsave.txt of the parent so that this OMFITtree can be loaded,
but will not save the actual content of this subtree. readOnly=True is meant to be
used only after this subtree is deployed where its filename says it will be. Using this
feature can result in much faster project saves if the content of this tree is large.
quiet – Verbosity level
developerMode – load OMFITpython objects within the tree as modifyOriginal
serverPicker – take server/tunnel info from MainSettings[‘SERVER’]
remote – access the filename in the remote directory
server – if specified the file will be downsynced from the server
tunnel – access the filename via the tunnel
**kw – Extra keywords are passed to the SortedDict class
if just the module name is provided, this will be loaded from the public modules
remote/branch:module format will load a module from a specific git remote and branch
module:remote/branch format will load a module from a specific git remote and branch
location – string with the location where to place the module in the OMFIT tree
withSubmodules – load submodules or not
availableModulesList – list of available modules generated by OMFIT.availableModules()
If this list is not passed, then the availableModulesList is generated internally
checkLicense – Check license files at load
developerMode – Load module with developer mode option (i.e. scripts loaded as modifyOriginal)
if None then default behavior is set by OMFIT['MainSettings']['SETUP']['developer_mode_by_default']
Note: public OMFIT installation cannot be loaded in developer mode
depth – parameter used internally for keeping track of the recursion depth
quiet – load modules silently or not
startup_lib – Used internally for executing OMFITlib_startup scripts
**kw – additional keywords passed to OMFITmodule() class
directories – list of directories to index. If None this is taken from OMFIT[‘MainSettings’][‘SETUP’][‘modulesDir’]
same_path_as – sample OMFITsave.txt path to set directory to index
force – force rebuild of .modulesInfoCache file
Returns:
This method returns a dictionary with the available OMFIT modules.
Each element in the dictionary is a dictionary itself with the details of the available modules.
Parse string representation of the dictionary path and return list including root name
This function can parse things like: OMFIT['asd'].attributes[u'aiy' ]["[ 'bla']['asa']"][3][1:5]
Parameters:
inv – string representation of the dictionary path
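For illustration, a much-simplified parser sketch (hypothetical helper; it does not handle attributes, slices, or nested quoted brackets like the real parser does):

    import re

    def parse_location_simple(inv):
        # Much-simplified illustration: split "OMFIT['a']['b'][3]" into ['OMFIT', 'a', 'b', 3].
        # The real parser also handles attributes, slices and quoted nested brackets.
        root, rest = re.match(r"(\w+)(.*)", inv).groups()
        keys = [root]
        for k in re.findall(r"\[([^\[\]]+)\]", rest):
            k = k.strip()
            keys.append(k[1:-1] if k[0] in "'\"" else int(k))
        return keys

    # parse_location_simple("OMFIT['asd']['bla'][3]") -> ['OMFIT', 'asd', 'bla', 3]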
Identifies location in the OMFIT tree of an OMFIT object
NOTE: Typical users should not need to use this function as part of their modules.
If you find yourself using this function in your modules, it is likely that OMFIT
already provides the functionality that you are looking for in some other way.
We recommend reaching out to the OMFIT developers team to see if there is an easy
way to get what you want.
Parameters:
obj – object in the OMFIT tree
memo – used internally to avoid infinite recursions
This function is meant to be called in the .save() function of objects of the class
OMFITobject that support dynamic loading. The idea is that if an object has not
been loaded, then its file representation has not changed and the original file can be reused.
This function returns True/False to say if it was successful at saving.
If True, then the original .save() function can return, otherwise it should go through
saving the data from memory to file.
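A minimal sketch of this pattern (illustrative only, not the actual OMFIT helper; assumes the object exposes .dynaLoad and .filename attributes):

    import os
    import shutil

    def reuse_unmodified_file(obj, new_filename):
        # Illustrative pattern only: if the object was never dynamically loaded
        # (dynaLoad still pending), its on-disk representation is unchanged and
        # the previously saved file can simply be copied instead of re-serialized.
        if getattr(obj, 'dynaLoad', False):
            if os.path.abspath(obj.filename) != os.path.abspath(new_filename):
                shutil.copy2(obj.filename, new_filename)
            return True   # the caller's .save() can return early
        return False      # the caller must write the data from memory to file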
If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]
If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v
In either case, this is followed by: for k in F: D[k] = F[k]
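For example, using a plain Python dict:

    D = {'a': 1}
    E = {'b': 2}                   # has .keys(): copied key by key
    D.update(E, c=3)               # F = {'c': 3} is applied last
    assert D == {'a': 1, 'b': 2, 'c': 3}

    D = {'a': 1}
    pairs = [('b', 2), ('a', 10)]  # no .keys(): iterated as (k, v) pairs
    D.update(pairs, a=0)           # F still overrides afterwards
    assert D == {'a': 0, 'b': 2}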
what – string with the regular expression to be cut across
sort – sorting of results alphabetically
returnKeys – return keys of elements in addition to objects
Returns:
list of objects or tuple with objects and keys
>> OMFIT['test'] = OMFITtree()
>> for k in range(5):
>>     OMFIT['test']['aaa'+str(k)] = OMFITtree()
>>     OMFIT['test']['aaa'+str(k)]['var'] = k
>>     OMFIT['test']['bbb'+str(k)] = -1
>> print(OMFIT['test'].across("['aaa*']['var']"))
key – function that returns a string that is used for sorting or dictionary key whose content is used for sorting
>> tmp = SortedDict()
>> for k in range(5):
>>     tmp[k] = {}
>>     tmp[k]['a'] = 4 - k
>> # by dictionary key
>> tmp.sort(key='a')
>> # or equivalently
>> tmp.sort(key=lambda x: tmp[x]['a'])
Parameters:
**kw – additional keywords passed to the underlying list sort command
Recursively searches the dictionary for a key in order to set its value.
Raises KeyError if the key could not be found, so this method cannot
be used to create new entries in the dictionary.
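A minimal sketch of this behavior (hypothetical helper, not the actual method):

    def set_existing(d, key, value):
        # Recursively search nested dictionaries for `key` and update its value;
        # raise KeyError if the key does not already exist anywhere.
        def _walk(node):
            if key in node:
                node[key] = value
                return True
            return any(_walk(v) for v in node.values() if isinstance(v, dict))
        if not _walk(d):
            raise KeyError(key)   # cannot be used to create new entries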
Subclassing from this class is like subclassing from the xarray.Dataset class
but without having to deal with the hassle of inheriting from xarrays
(internally this class uses class composition rather than subclassing).
Also this class makes it possible to use the OMFIT dynamic loading capabilities.
All classes that subclass OMFITdataset must define the .dynaLoad attribute.
NOTE: Classes that subclass from OMFITdataset will be identified
as an xarray.Dataset when using isinstance(…, xarray.Dataset)
within OMFIT
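A minimal sketch of the composition idea (illustrative only, not the actual OMFITdataset implementation):

    import xarray

    class DatasetWrapper(object):
        # Hold an xarray.Dataset internally and forward attribute access to it,
        # so subclasses behave like a Dataset without inheriting from it.
        def __init__(self, *args, **kw):
            self._dataset = xarray.Dataset(*args, **kw)

        def __getattr__(self, attr):
            if attr == '_dataset':
                raise AttributeError(attr)
            return getattr(self._dataset, attr)

        def __getitem__(self, key):
            return self._dataset[key]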
tight_layout: Whether to use the tight layout mechanism. See .set_tight_layout.
Discouraged: the use of this parameter is discouraged. Please use
layout='tight' instead for the common case of
tight_layout=True and use .set_tight_layout otherwise.
constrained_layout: Discouraged: the use of this parameter is discouraged. Please use
layout='constrained' instead.
layout : {'constrained', 'tight'}, optional
The layout mechanism for positioning of plot elements.
Supported values:
‘constrained’: The constrained layout solver usually gives the
best layout results and is thus recommended. However, it is
computationally expensive and can be slow for complex figures
with many elements.
See /tutorials/intermediate/constrainedlayout_guide
for examples.
‘tight’: Use the tight layout mechanism. This is a relatively
simple algorithm, that adjusts the subplot parameters so that
decorations like tick labels, axis labels and titles have enough
space. See .Figure.set_tight_layout for further details.
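For example (the layout keyword requires matplotlib >= 3.5):

    import matplotlib.pyplot as plt

    # constrained layout is usually the best choice for crowded figures
    fig, axs = plt.subplots(2, 2, layout='constrained')

    # tight layout is a simpler alternative for uncomplicated figures
    fig2, ax2 = plt.subplots(layout='tight')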
Properties:
agg_filter: a filter function, which takes a (m, n, 3) float array and a dpi value, and returns a (m, n, 3) array and two offsets from the bottom left corner of the image
alpha: scalar or None
animated: bool
canvas: FigureCanvas
clip_box: .Bbox
clip_on: bool
clip_path: Patch or (Path, Transform) or None
constrained_layout: bool or dict or None
constrained_layout_pads: float, default: :rc:`figure.constrained_layout.w_pad`
dpi: float
edgecolor: color
facecolor: color
figheight: float
figure: .Figure
figwidth: float
frameon: bool
gid: str
in_layout: bool
label: object
linewidth: number
path_effects: .AbstractPathEffect
picker: None or bool or float or callable
rasterized: bool
size_inches: (float, float) or float
sketch_params: (scale: float, length: float, randomness: float)
snap: bool or None
tight_layout: bool or dict with keys “pad”, “w_pad”, “h_pad”, “rect” or None
transform: .Transform
url: str
visible: bool
zorder: float
Print all data in the figure's line plots (pyplot.plot) to a text file. The x values
will be taken from the line with the greatest number of
points in the (first) axis, and other lines are interpolated
if their x values do not match. Column labels are the
line labels and xlabel.
Output the contents of the figure (self) to an HDF5 file given by filename
Parameters:
filename – The path and basename of the file to save to (the extension is stripped, and ‘.h5’ is added)
For the purpose of the GA data management plan, these files can be uploaded directly to https://diii-d.gat.com/dmp
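An illustrative sketch of what such an export might look like (not the actual implementation; assumes h5py is available):

    import h5py

    def lines_to_hdf5(fig, filename):
        # Dump the x/y data of every line in every axes of `fig` to an HDF5 file
        with h5py.File(filename + '.h5', 'w') as f:
            for i, ax in enumerate(fig.get_axes()):
                grp = f.create_group('axes_%d' % i)
                for j, line in enumerate(ax.get_lines()):
                    grp.create_dataset('line_%d_x' % j, data=line.get_xdata())
                    grp.create_dataset('line_%d_y' % j, data=line.get_ydata())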
Return the figure canvas for the tab with the given label, creating a new tab if that label does not yet exist.
If fig is passed, then that fig is inserted into the tab.
The matplotlib.cm.ScalarMappable (i.e., ~matplotlib.image.AxesImage,
~matplotlib.contour.ContourSet, etc.) described by this colorbar.
This argument is mandatory for the .Figure.colorbar method but optional
for the .pyplot.colorbar function, which sets the default to the current
image.
Note that one can create a .ScalarMappable “on-the-fly” to generate
colorbars not attached to a previously drawn artist, e.g.
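For instance (a standard matplotlib pattern):

    import matplotlib.pyplot as plt
    from matplotlib import cm
    from matplotlib.colors import Normalize

    fig, ax = plt.subplots()
    # colorbar without a previously drawn artist
    fig.colorbar(cm.ScalarMappable(norm=Normalize(0, 1), cmap='viridis'), ax=ax)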
One or more parent axes from which space for a new colorbar axes will be
stolen, if cax is None. This has no effect if cax is set.
use_gridspec : bool, optional
If cax is None, a new cax is created as an instance of Axes. If
ax is an instance of Subplot and use_gridspec is True, cax is
created as an instance of Subplot using the gridspec module.
location : None or {'left', 'right', 'top', 'bottom'}
The location, relative to the parent axes, where the colorbar axes
is created. It also determines the orientation of the colorbar
(colorbars on the left and right are vertical, colorbars at the top
and bottom are horizontal). If None, the location will come from the
orientation if it is set (vertical colorbars on the right, horizontal
ones at the bottom), or default to ‘right’ if orientation is unset.
orientation : None or {'vertical', 'horizontal'}
The orientation of the colorbar. It is preferable to set the location
of the colorbar, as that also determines the orientation; passing
incompatible values for location and orientation raises an exception.
fraction : float, default: 0.15
Fraction of original axes to use for colorbar.
shrink : float, default: 1.0
Fraction by which to multiply the size of the colorbar.
aspect : float, default: 20
Ratio of long to short dimensions.
pad : float, default: 0.05 if vertical, 0.15 if horizontal
Fraction of original axes between colorbar and new image axes.
anchor : (float, float), optional
The anchor point of the colorbar axes.
Defaults to (0.0, 0.5) if vertical; (0.5, 1.0) if horizontal.
panchor : (float, float) or False, optional
The anchor point of the colorbar parent axes. If False, the parent
axes’ anchor will be unchanged.
Defaults to (1.0, 0.5) if vertical; (0.5, 0.0) if horizontal.
colorbar properties:

extend : {'neither', 'both', 'min', 'max'}
If not 'neither', make pointed end(s) for out-of-range values. These are set for a given colormap using the colormap set_under and set_over methods.

extendfrac : {None, 'auto', length, lengths}
If set to None, both the minimum and maximum triangular colorbar extensions will have a length of 5% of the interior colorbar length (this is the default setting). If set to 'auto', makes the triangular colorbar extensions the same lengths as the interior boxes (when spacing is set to 'uniform') or the same lengths as the respective adjacent interior boxes (when spacing is set to 'proportional'). If a scalar, indicates the length of both the minimum and maximum triangular colorbar extensions as a fraction of the interior colorbar length. A two-element sequence of fractions may also be given, indicating the lengths of the minimum and maximum colorbar extensions respectively as a fraction of the interior colorbar length.

extendrect : bool
If False the minimum and maximum colorbar extensions will be triangular (the default). If True the extensions will be rectangular.

spacing : {'uniform', 'proportional'}
Uniform spacing gives each discrete color the same space; proportional makes the space proportional to the data interval.

ticks : None or list of ticks or Locator
If None, ticks are determined automatically from the input.

format : None or str or Formatter
If None, ~.ticker.ScalarFormatter is used. If a format string is given, e.g., '%.3f', that is used. An alternative ~.ticker.Formatter may be given instead.

drawedges : bool
Whether to draw lines at color boundaries.

label : str
The label on the colorbar's long axis.

The following will probably be useful only in the context of indexed colors (that is, when the mappable has norm=NoNorm()), or other unusual circumstances.

boundaries : None or a sequence

values : None or a sequence which must be of length 1 less than the sequence of boundaries. For each region delimited by adjacent entries in boundaries, the color mapped to the corresponding value in values will be used.
If mappable is a ~.contour.ContourSet, its extend kwarg is included
automatically.
The shrink kwarg provides a simple way to scale the colorbar with respect
to the axes. Note that if cax is specified, it determines the size of the
colorbar and shrink and aspect kwargs are ignored.
For more precise control, you can manually specify the positions of
the axes objects in which the mappable and the colorbar are drawn. In
this case, do not use any of the axes properties kwargs.
It is known that some vector graphics viewers (svg and pdf) render white gaps
between segments of the colorbar. This is due to bugs in the viewers, not
Matplotlib. As a workaround, the colorbar can be rendered with overlapping
segments:
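A sketch of this workaround, as given in the matplotlib documentation (assuming an existing mappable, figure and axes):

    cbar = fig.colorbar(mappable, ax=ax)
    cbar.solids.set_edgecolor('face')   # draw the color segments with overlapping edges
    fig.canvas.draw()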
However this has negative consequences in other circumstances, e.g. with
semi-transparent images (alpha < 1) and colorbar extensions; therefore, this
workaround is not used by default (see issue #1188).
Three integers (nrows, ncols, index). The subplot will take the
index position on a grid with nrows rows and ncols columns.
index starts at 1 in the upper left corner and increases to the
right. index can also be a two-tuple specifying the (first,
last) indices (1-based, and including last) of the subplot, e.g.,
fig.add_subplot(3,1,(1,2)) makes a subplot that spans the
upper 2/3 of the figure.
A 3-digit integer. The digits are interpreted as if given separately
as three single-digit integers, i.e. fig.add_subplot(235) is the
same as fig.add_subplot(2,3,5). Note that this can only be used
if there are no more than 9 subplots.
The projection type of the subplot (~.axes.Axes). str is the name
of a custom projection, see ~matplotlib.projections. The default
None results in a ‘rectilinear’ projection.
polar : bool, default: False
If True, equivalent to projection='polar'.
sharex, sharey : ~.axes.Axes, optional
Share the x or y ~matplotlib.axis with sharex and/or sharey. The
axis will have the same limits, ticks, and scale as the axis of the
shared axes.
.axes.SubplotBase, or another subclass of ~.axes.Axes
The axes of the subplot. The returned axes base class depends on
the projection used. It is ~.axes.Axes if rectilinear projection
is used and .projections.polar.PolarAxes if polar projection
is used. The returned axes is then a subplot subclass of the
base class.
This method also takes the keyword arguments for the returned axes
base class; except for the figure argument. The keyword arguments
for the rectilinear base class ~.axes.Axes can be found in
the following table but there might also be other keyword
arguments if another projection is used.
Properties:
adjustable: {‘box’, ‘datalim’}
agg_filter: a filter function, which takes a (m, n, 3) float array and a dpi value, and returns a (m, n, 3) array and two offsets from the bottom left corner of the image
alpha: scalar or None
anchor: (float, float) or {‘C’, ‘SW’, ‘S’, ‘SE’, ‘E’, ‘NE’, …}
animated: bool
aspect: {‘auto’, ‘equal’} or float
autoscale_on: bool
autoscalex_on: bool
autoscaley_on: bool
axes_locator: Callable[[Axes, Renderer], Bbox]
axisbelow: bool or ‘line’
box_aspect: float or None
clip_box: .Bbox
clip_on: bool
clip_path: Patch or (Path, Transform) or None
facecolor or fc: color
figure: .Figure
frame_on: bool
gid: str
in_layout: bool
label: object
navigate: bool
navigate_mode: unknown
path_effects: .AbstractPathEffect
picker: None or bool or float or callable
position: [left, bottom, width, height] or ~matplotlib.transforms.Bbox
prop_cycle: unknown
rasterization_zorder: float or None
rasterized: bool
sketch_params: (scale: float, length: float, randomness: float)
snap: bool or None
title: str
transform: .Transform
url: str
visible: bool
xbound: unknown
xlabel: str
xlim: (bottom: float, top: float)
xmargin: float greater than -0.5
xscale: {“linear”, “log”, “symlog”, “logit”, …} or .ScaleBase
xticklabels: unknown
xticks: unknown
ybound: unknown
ylabel: str
ylim: (bottom: float, top: float)
ymargin: float greater than -0.5
yscale: {“linear”, “log”, “symlog”, “logit”, …} or .ScaleBase
yticklabels: unknown
yticks: unknown
zorder: float
Creating a new Axes will delete any pre-existing Axes that
overlaps with it beyond sharing a boundary:
    import matplotlib.pyplot as plt

    # plot a line, implicitly creating a subplot(111)
    plt.plot([1, 2, 3])

    # now create a subplot which represents the top plot of a grid with 2 rows
    # and 1 column. Since this subplot will overlap the first, the plot (and its
    # axes) previously created will be removed
    plt.subplot(211)
If you do not want this behavior, use the .Figure.add_subplot method
or the .pyplot.axes function instead.
If no kwargs are passed and there exists an Axes in the location
specified by args then that Axes will be returned rather than a new
Axes being created.
If kwargs are passed and there exists an Axes in the location
specified by args, the projection type is the same, and the
kwargs match with the existing Axes, then the existing Axes is
returned. Otherwise a new Axes is created with the specified
parameters. We save a reference to the kwargs which we use
for this comparison. If any of the values in kwargs are
mutable we will not detect the case where they are mutated.
In these cases we suggest using .Figure.add_subplot and the
explicit Axes API rather than the implicit pyplot API.
    plt.subplot(221)

    # equivalent but more general
    ax1 = plt.subplot(2, 2, 1)

    # add a subplot with no frame
    ax2 = plt.subplot(222, frameon=False)

    # add a polar subplot
    plt.subplot(223, projection='polar')

    # add a red subplot that shares the x-axis with ax1
    plt.subplot(224, sharex=ax1, facecolor='red')

    # delete ax2 from the figure
    plt.delaxes(ax2)

    # add ax2 to the figure again
    plt.subplot(ax2)

    # make the first axes "current" again
    plt.subplot(221)
It assumes the data points are dense in the x dimension
compared to the screen resolution at all points in the plot.
It will resize when the axes are clicked on.
pythonFile – is meant to be an OMFITpythonGUI object in the OMFIT tree
title – title to appear in the compound GUI frame.
If None, the location of the pythonFile object in the OMFIT tree will be shown.
If an empty string, the compound GUI title is suppressed.
This method creates a GUI element of the entry type
The background of the GUI gets colored green/red depending on whether the input by the user is a valid Python entry
Parameters:
location – location in the OMFIT tree (notice that this is a string)
lbl – Label which is put on the left of the entry
comment – A comment which appears on top of the entry
updateGUI – Force a re-evaluation of the GUI script when this parameter is changed
help – help provided when user right-clicks on GUI element (adds GUI button)
preentry – function to pre-process the data at the OMFIT location to be displayed in the entry GUI element
postcommand – command to be executed after the value in the tree is updated. This command will receive the OMFIT location string as an input
check – function that returns whether what the user has entered in the entry GUI element is a valid entry.
If the check fails, the background is colored yellow and users will not be able to set the value.
default – Set the default value if the tree location does not exist (adds GUI button)
delete_if_default – Delete tree entry if the value is the default value
multiline – Force display of button for multiple-line text entry
norm – normalizes numeric variables (overrides preentry or postcommand)
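A hedged usage sketch (the tree location, check function, and help text are illustrative):

    OMFITx.Entry("root['SETTINGS']['PHYSICS']['Te_scale']",
                 'Te scale factor',
                 default=1.0,
                 check=lambda value: isinstance(value, (int, float)) and value > 0,
                 help='Multiplier applied to the electron temperature profile')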
This method creates a GUI element of the combobox type.
The background of the GUI gets colored green/red depending on whether the input by the user is a valid Python entry
Notice that this method can be used to set multiple entries at once:
ComboBox(["root['asd']", "root['dsa']", "root['aaa']"], {'': [0, 0, 0], 'a': [1, 1, 0], 'b': [1, 0, '***']}, 'Test multi', default=[0, 0, 0])
which comes in very handy when complex/exclusive switch combinations need to be set in a namelist file, for example.
Use the string *** to leave parameters unchanged.
Parameters:
location – location in the OMFIT tree (notice that this is either a string or a list of strings)
options – possible options the user can choose from. This can be a list or a dictionary.
lbl – Label which is put on the left of the entry
comment – A comment which appears on top of the entry
updateGUI – Force a re-evaluation of the GUI script when this parameter is changed
state –
'readonly' (default): the user cannot type arbitrary text, only select among the options
'normal': allows the user to type in any value
'search': allows searching for entries
help – help provided when user right-clicks on GUI element (adds GUI button)
postcommand – command to be executed after the value in the tree is updated. This command will receive the OMFIT location string as an input
check – function that returns whether what the user has entered in the entry GUI element is a valid entry.
If the check fails, the background is colored yellow and users will not be able to set the value.
default – Set the default value if the tree location does not exist (adds GUI button)
This method creates a GUI element of the filePicker type, which allows picking a file/directory
Parameters:
location – location in the OMFIT tree (notice that this is a string)
lbl – label to be shown near the button
help – help provided when user right-clicks on GUI element (adds GUI button)
postcommand – command to be executed after the value in the tree is updated. This command will receive the OMFIT location string as an input
updateGUI – Force a re-evaluation of the GUI script when this parameter is changed
localRemote – True: both, ‘local’: only local, ‘remote’: only remote
transferRemoteFile –
controls what goes into location
string with local filename (if transferRemoteFile==True)
string with the filename (if transferRemoteFile==False)
tuple with the filename,server,tunnel (if transferRemoteFile==None)
if transferRemoteFile=True, then the file is transferred to a temporary folder
if transferRemoteFile is a string, then it will be interpreted as the directory where to move the file
directory – whether it’s a directory or a file
action – ‘open’ or ‘save’
tree – load from OMFIT tree location
url – open url in web-browser (adds GUI button)
kwlabel – keywords passed to ttk.Label
init_directory_location – The contents of this location are used to set the initial directory for file searches.
If a file name is specified the directory will be determined from the file name and this input ignored.
Otherwise, if set this will be used to set the initial directory.
init_pattern_location – The default pattern is '*'. If this is specified then the contents of the tree location will replace the default initial pattern.
favorite_list_location – OMFIT tree location which contains a possibly empty list of favorite file directories. To keep with the general omfit approach this should be a string.
pattern_list_location – OMFIT tree location which contains a possibly empty list of favorite search patterns. To keep with the general omfit approach this should be a string.
reveal_location – location used for creation of the help (this is used internally by OMFIT, should not be used by users)
This helper method creates a GUI element of the objectPicker type, which allows loading objects into the tree.
If an object is already present at the location, then a button allows picking of a different object.
Notice that this GUI element will always call an updateGUI
Parameters:
location – location in the OMFIT tree (notice that this is a string)
lbl – label to be shown near the button/object picker
objectType – class of the object that one wants to load (e.g. OMFITnamelist, OMFITgeqdsk, …)
if objectType is None then the object selected with Tree is deepcopied
objectKW – keywords passed to the object
postcommand – command to be executed after the value in the tree is updated. This command will receive the OMFIT location string as an input.
unset_postcommand – command to be executed after the value in the tree is deleted. This command will receive the OMFIT location string as an input.
kwlabel – keywords passed to ttk.Label
init_directory_location – The contents of this location are used to set the initial directory for file searches.
If a file name is specified the directory will be determined from the file name and this input ignored.
Otherwise, if set this will be used to set the initial directory.
init_pattern_location – The default pattern is '*'. If this is specified then the contents of the tree location will replace the default initial pattern.
favorite_list_location – OMFIT tree location which contains a possibly empty list of favorite file directories. To keep with the general omfit approach this should be a string.
pattern_list_location – OMFIT tree location which contains a possibly empty list of favorite search patterns. To keep with the general omfit approach this should be a string.
**kw – extra keywords are passed to the FilePicker object
This method creates a GUI element of the combobox type for the selection of modules within the OMFIT project.
Parameters:
location – location in the OMFIT tree (notice that this is either a string or a list of strings)
modules – string or list of strings with IDs of the allowed modules. If modules is None all modules in OMFIT are listed
lbl – label to be shown near the combobox
load – list of two elements lists with module name and location where modules can be loaded
eg. [[‘OMFITprofiles’,”root[‘OMFITprofiles’]”],[‘EFIT’,”OMFITmodules[-2][‘EFIT’]”],]
Setting load=True will set loading of the modules as submodules
This method creates a GUI element used to select a tree location
The label of the GUI turns green/red if the input by the user is a valid OMFIT tree entry (non existing tree entries are allowed)
The label of the GUI turns green/red if the input by the user does or doesn’t satisfy the check (non valid tree entries are NOT allowed)
Parameters:
location – location in the OMFIT tree (notice that this is a string)
lbl – Label which is put on the left of the entry
comment – A comment which appears on top of the entry
kwlabel – keywords passed to ttk.Label
default – Set the default value if the tree location does not exist (adds GUI button)
help – help provided when user right-clicks on GUI element (adds GUI button)
url – open url in web-browser (adds GUI button)
updateGUI – Force a re-evaluation of the GUI script when this parameter is changed
postcommand – command to be executed after the value in the tree is updated. This command will receive the OMFIT location string as an input
check – function that returns whether what the user has entered in the entry GUI element is a valid entry.
If the check fails, the label is colored yellow and users will not be able to set the value.
base – object in location with respect to which relative locations are evaluated
GUI element to add or remove objects to a list
Note: multiple items selection possible with the Shift and Ctrl keys
Parameters:
location – location in the OMFIT tree (notice that this is a string).
options – possible options the user can choose from. This can be a tree location, a list, or a dictionary.
If a dictionary, then keys are shown in the GUI and values are set in the list.
In order to use “show_delete_button”, this must be a string giving the location of a list in the tree.
lbl – Label which is put on the left of the entry
default – Set the default value if the tree location does not exist
unique – Do not allow repetitions in the list
ordered – Keep the same order as in the list of options
If false, then buttons to move elements up/down are shown
updateGUI – Force a re-evaluation of the GUI script when this parameter is changed
postcommand – function to be called after a button is pushed. It is called as postcommand(location=location,button=button) where button is in [‘add’,’add_all’,’remove’,’remove_all’]
only_valid_options – list can only contain valid options
help – help provided when user right-clicks on GUI element (adds GUI button)
url – open url in web-browser (adds GUI button)
show_delete_button – bool: Show an additional button for deleting items from the left hand list
This high level GUI allows setting of DEVICE/SHOT/TIME of each module
(sets up OMFIT MainSettings if root[‘SETTINGS’][‘EXPERIMENT’][‘XXX’] is an expression)
Parameters:
postcommand – command to be executed every time device,shot,time are changed (location is passed to postcommand)
showDevice – True/False show device section or list of suggested devices
showShot – True/False show shot section or list with list of suggested shots
showTime – True/False show time section or list with list of suggested times
showRunID – True/False show runID Entry
multiShots – True/False show single/multi shots
multiTimes – True/False show single/multi times
showSingleTime – True/False if multiTimes, still show single time
checkDevice – check if device user input satisfies condition
checkShot – check if shot user input satisfies condition
checkTime – check if time user input satisfies condition
checkRunID – check if runID user input satisfies condition
subMillisecondTime – Allow floats as times
stopIfNotSet – Stop GUI visualization if shot/time/device are not set
INFO : forest green
HIST : dark slate gray
WARNING : DarkOrange2
HELP : PaleGreen4
STDERR : red3
STDOUT : black
DEBUG : gold4
PROGRAM_OUT : blue
PROGRAM_ERR : purple
Standard remote file selection dialog – no checks on selected file.
Parameters:
directory – directory where to start browsing
serverPicker – serverPicker wins over server/tunnel settings
serverPicker=None will reuse the latest server/tunnel that the user browsed to
server – server
tunnel – tunnel
pattern – glob regular expression for files selection
default – default filename selection
master – Tkinter master GUI
lockServer – allow users to change server settings
focus – what to focus in the GUI (‘filterDirs’,’filterFiles’)
favorite_list_location – OMFIT tree location which contains a possibly empty list of favorite file directories. To keep with the general omfit approach this should be a string.
pattern_list_location – OMFIT tree location which contains a possibly empty list of favorite search patterns. To keep with the general omfit approach this should be a string.
is_dir – (bool) Whether the requested file is a directory
Opens up a dialogue asking filename, server/tunnel for remote file transfer
This function is mostly used within the framework; for use in OMFIT GUI scripts
please consider using the OMFITx.FilePicker and OMFITx.ObjectPicker functions instead.
Parameters:
parent – Tkinter parent GUI
transferRemoteFile – [True,False,None] if True the remote file is transferred to the OMFITcwd directory
remoteFilename – initial string for remote filename
server – initial string for server
tunnel – initial string for tunnel
init_directory_location – The contents of this location are used to set the initial directory for file searches.
If a file name is specified the directory will be determined from the file name and this input ignored.
Otherwise, if set this will be used to set the initial directory.
init_pattern_location – The default pattern is '*'. If this is specified then the contents of the tree location will replace the default initial pattern.
favorite_list_location – OMFIT tree location which contains a possibly empty list of favorite file directories. To keep with the general omfit approach this should be a string.
pattern_list_location – OMFIT tree location which contains a possibly empty list of favorite search patterns. To keep with the general omfit approach this should be a string.
Returns:
is controlled with transferRemoteFile parameter
string with local filename (if transferRemoteFile==True)
string with the filename (if transferRemoteFile==False)
tuple with the filename,server,tunnel (if transferRemoteFile==None)
This function retrieves information from a remote server (like the shell which is running there):
{'ARG_MAX':4611686018427387903,'QSTAT':'','SQUEUE':'/opt/slurm/default/bin/squeue','environment':OMFITenv([]),'id':6216321643098941518,'login':['.cshrc','.login'],'logout':['.logout'],'shell':'csh','shell_path':'/bin/csh','sysinfo':'csh\nARG_MAX=4611686018427387903\nQSTAT=\nSQUEUE=/opt/slurm/default/bin/squeue\necho: No match.'}
Information from the remote server is stored in a dictionary
This function allows execution of commands on the local workstation.
Parameters:
command_line – string to be executed locally
interactive_input – interactive input to be passed to the command
ignoreReturnCode – ignore return code of the command
std_out – if a list is passed (e.g. []), the stdout of the program will be put there line by line
std_err – if a list is passed (e.g. []), the stderr of the program will be put there line by line
quiet – print command to screen or not
arguments – arguments that are passed to the command_line
script – string with script to be executed.
script option substitutes %s with the automatically generated name of the script
if script is a list or a tuple, then the first item should be the script itself and the second should be the script name
use_bang_command – Execute commands via OMFIT_run_command.sh script (useful to execute scripts within a given shell: #!/bin/…)
If use_bang_command is a string, then the run script will take that filename.
Notice that setting use_bang_command=True is not safe for multiple processes running in the same directory.
progressFunction – user function to which the std-out of the process is passed and returns values from 0 to 100 to indicate progress towards completion
extraButtons – dictionary with key/function that is used to add extra buttons to the GUI. The function receives a dictionary with the process std_out and pid
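A hedged usage sketch, assuming this function is exposed as OMFITx.execute (command and keywords are illustrative):

    std_out = []
    OMFITx.execute('ls -la',          # command_line, executed locally
                   std_out=std_out,   # capture stdout line by line
                   quiet=True)
    print('\n'.join(std_out))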
This function allows execution of commands on remote workstations.
It has the logic to check if the remote workstation is the local workstation and in that case executes locally.
Parameters:
server – server to connect and execute the command
command_line – string to be executed remotely (NOTE that if server=’’, the command is executed locally in the local directory)
remotedir – remote working directory, if remote directory does not exist it will be created
tunnel – tunnel to go through to connect to the server
interactive_input – interactive input to be passed to the command
ignoreReturnCode – ignore return code of the command
std_out – if a list is passed (e.g. []), the stdout of the program will be put there line by line
std_err – if a list is passed (e.g. []), the stderr of the program will be put there line by line
quiet – print command to screen or not
arguments – arguments that are passed to the command_line
script – string with script to be executed.
script option substitutes %s with the automatically generated name of the script
if script is a list or a tuple, then the first item should be the script itself and the second should be the script name
forceRemote – force remote connection even if server is localhost
use_bang_command – execute commands via OMFIT_run_command.sh script (useful to execute scripts within a given shell: #!/bin/…)
If use_bang_command is a string, then the run script will take that filename.
Notice that setting use_bang_command=True is not safe for multiple processes running in the same directory.
progressFunction – user function to which the std-out of the process is passed and returns values from 0 to 100 to indicate progress towards completion
extraButtons – dictionary with key/function that is used to add extra buttons to the GUI. The function receives a dictionary with the process std_out and pid
Function to download files/directories from remote server (possibly via tunnel connection)
NOTE: this function relies on rsync.
There is no way to arbitrarily rename files with rsync.
All rsync can do is move files to a different directory.
Parameters:
server – server to connect and execute the command
remote – remote file(s) (string or list strings) to downsync
local – local directory or file to save files to
tunnel – tunnel to go through to connect to the server
ignoreReturnCode – whether to ignore return code of the rsync command
keepRelativeDirectoryStructure – string with the common base directory of the remote files to be removed (usually equals remote_dir)
quiet – print command to screen or not
use_scp – (bool) If this flag is True remote_downsync will be executed with “scp” instead of “rsync”. Use for increased download speed. (default: False)
Returns:
return code of the rsync command (or True if keepRelativeDirectoryStructure and ignoreReturnCode and some rsync fail)
This class provides a live IDL session via the pidly module: https://pypi.python.org/pypi/pyIDL/
In practice this class wraps the pidly.IDL session so that it can handle SERVERS remote connections (including tunneling) and directory management the OMFIT way.
The IDL executable is taken from the idl entry of this server under OMFIT[‘MainSettings’][‘SERVER’].
Local and remote working directories are specified in root[‘SETTINGS’][‘SETUP’][‘workDir’] and root[‘SETTINGS’][‘REMOTE_SETUP’][‘workDir’].
Server and tunnel are specified in root[‘SETTINGS’][‘REMOTE_SETUP’][‘server’] and root[‘SETTINGS’][‘REMOTE_SETUP’][‘tunnel’].
If the tunnel is an empty string, the connection to the remote server is direct. If server is an empty string, everything will occur locally and the remote working directory will be ignored.
Parameters:
module_root – root of the module (e.g. root)
server – override module server
tunnel – override module tunnel
executable – override the executable (by default this is taken from the idl entry of this server under OMFIT['MainSettings']['SERVER'])
workdir – override module local working directory
remotedir – override module remote working directory
clean –
clear local/remote working directories
”local”: clean local working directory only
”local_force”: force clean local working directory only
”remote”: clean remote working directory only
”remote_force”: force clean remote working directory only
Function used to upload files from the local working directory to remote IDL directory
Parameters:
inputs – list of input objects or path to files, which will be deployed in the local or remote working directory.
To deploy objects with a different name one can specify tuples (inputObject,’deployName’)
ignoreReturnCode – whether to ignore return code of the rsync command
High level function to simplify initialization of directories within a module. This function will:
1) Create and clear local and remote working directories
2) Change directory to local working directory
Server and tunnel are specified in root[‘SETTINGS’][‘REMOTE_SETUP’][‘server’] and root[‘SETTINGS’][‘REMOTE_SETUP’][‘tunnel’]
Local and remote working directories are specified in root[‘SETTINGS’][‘SETUP’][‘workDir’] and root[‘SETTINGS’][‘REMOTE_SETUP’][‘workDir’]
Parameters:
module_root – root of the module
server – string that overrides module server
tunnel – string that overrides module tunnel
workdir – string that overrides module local working directory
remotedir – string that overrides module remote working directory
clean –
clear local/remote working directories
”local”: clean local working directory only
”local_force”: force clean local working directory only
”remote”: clean remote working directory only
”remote_force”: force clean remote working directory only
True: clean both
”force”: force clean both
False: clean neither
quiet – print command to screen or not
Returns:
strings for local and remote directories (None if there was a problem in either one)
High level function that simplifies local/remote execution of software within a module.
This function will:
1. cd to the local working directory
2. Clear local/remote working directories [True] by default
3. Deploy the "input" objects to the local working directory
4. Upload the files remotely
5. Execute the software
6. Download “output” files to local working directory
Executable command is specified in root[‘SETTINGS’][‘SETUP’][‘executable’]
Local and remote working directories are specified in root[‘SETTINGS’][‘SETUP’][‘workDir’] and root[‘SETTINGS’][‘REMOTE_SETUP’][‘workDir’].
Server and tunnel are specified in root[‘SETTINGS’][‘REMOTE_SETUP’][‘server’] and root[‘SETTINGS’][‘REMOTE_SETUP’][‘tunnel’].
If the tunnel is an empty string, the connection to the remote server is direct. If server is an empty string, everything will occur locally and the remote working directory will be ignored.
Parameters:
module_root – root of the module (e.g. root) used to set default values for ‘executable’,’server’,’tunnel’,’workdir’, ‘remotedir’
if module_root is None or module_root is OMFIT then
‘executable’,’server’,’tunnel’,’workdir’, ‘remotedir’ must be specified
inputs – list of input objects or path to files, which will be deployed in the local or remote working directory.
To deploy objects with a different name one can specify tuples (inputObject,’deployName’)
outputs – list of output files which will be fetched from the remote directory
clean –
clear local/remote working directories
”local”: clean local working directory only
”local_force”: force clean local working directory only
”remote”: clean remote working directory only
”remote_force”: force clean remote working directory only
True: clean both [DEFAULT]
”force”: force clean both
False: clean neither
arguments – arguments which will be passed to the executable
interactive_input – interactive input to be passed to the executable
server – override module server
tunnel – override module tunnel
executable – override module executable
workdir – override module local working directory
remotedir – override module remote working directory
ignoreReturnCode – ignore return code of executable
std_out – if a list is passed (e.g. []),
the stdout of the program will be put there line by line;
if a string is passed and bool(queued), this should indicate
the path of the file that gives the stdout of the queued job
std_err – if a list is passed (e.g. []),
the stderr of the program will be put there line by line;
if a string is passed and bool(queued), this should indicate
the path of the file that gives the stdout of the queued job
quiet – if True, suppress output to the command box
keepRelativeDirectoryStructure – [True/False] keep relative directory structure of the remote files
script – string with script to be executed. The script option requires a %s
in the command line at the location where you want the script filename to appear
if script is a list or a tuple, then the first item should be the script itself and the second should be the script name
forceRemote – force remote connection even if server is localhost
progressFunction – user function to which the std-out of the process is
passed and returns values from 0 to 100 to indicate progress towards completion
queued – If cast as bool is True, invokes manage_job, using queued as
qsub_findID keyword of manage_job, and also takes over std_out and std_err
use_bang_command – Execute commands via OMFIT_run_command.sh script (useful to execute scripts within a given shell: #!/bin/…)
If use_bang_command is a string, then the run script will take that filename.
Notice that setting use_bang_command=True is not safe for multiple processes running in the same directory.
extraButtons – dictionary with key/function that is used to add extra buttons to the GUI. The function receives a dictionary with the process std_out and pid
xterm – if True, launch the command in its own xterm
clean_after – (bool) If this flag is True, the remote directory will be removed once the outputs have been transferred to the local working directory. The remote directory must have OMFIT in its name. (default: False)
use_scp – (bool) If this flag is True, the remote downsync of data will use the “scp” command instead of “rsync”. This should be used for increased download speed. (default: False)
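A hedged usage sketch of a typical module run (tree locations and filenames are illustrative):

    OMFITx.executable(root,
                      inputs=[root['INPUTS']['input_namelist'],
                              (root['INPUTS']['gEQDSK'], 'g123456.01000')],  # deploy with a different name
                      outputs=['run.log', 'results.nc'],
                      arguments='-v',
                      std_out=[])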
Execute a Python target_function that is self-contained in a Python python_script.
Useful to execute a Python module as a separate process on a local (or remote) workstation.
This function relies on the OMFITx.executable function, and additional keyword arguments are passed to it.
Parameters:
module_root – root of the module (e.g. root) used to set default values for ‘executable’,’server’,’tunnel’,’workdir’, ‘remotedir’
if module_root is None or module_root is OMFIT then
‘executable’,’server’,’tunnel’,’workdir’, ‘remotedir’ must be specified
python_script – OMFITpythonTask (or string) to execute
target_function – function in the python_script that will be called
namespace – dictionary with variables passed to the target_function
executable – python executable (if None then is set based on SERVER)
forceRemote – force remote connection even if server is localhost
pickle_protocol – pickle protocol version (use 2 for Python2/3 compatibility)
clean_local – (bool) If this flag is True, the local working directory is cleaned and deleted after the result to be returned has been loaded into memory. The directory must have OMFIT somewhere in the name as a safety measure. (default: False).
**kw – additional arguments are passed to the underlying OMFITx.executable function
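A hedged usage sketch, assuming this function is exposed as OMFITx.remote_python (script name, function name, and namespace are illustrative):

    result = OMFITx.remote_python(root,
                                  python_script=root['SCRIPTS']['heavy_task'],
                                  target_function='run_analysis',
                                  namespace={'shot': 123456, 'nproc': 8})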
Wait for a job to finish and follow the output file and print the error file that the job generates
Parameters:
output_file – output file that will be followed until the job ends (on the std_out)
error_file – error file that will be printed when the job ends (on the std_err)
progressFunction – user function to which the std-out of the process is passed and returns values from 0 to 100 to indicate progress towards completion
extraButtons – dictionary with key/function that is used to add extra buttons to the GUI. The function receives a dictionary with the process std_out and pid
module_root – The module instance from which servers, etc. are culled
batch_command – A multi-line string or list of strings that should be executed
environment – A string to be executed to set up the environment before launching the batch job
partition – A string to be inserted into the batch script that indicates
which partition(s) (comma separated) to run on; if None, execute batch_command serially
partition_flag – A string to be inserted before the partition names which matches the
system configuration (e.g. -p, --qos)
nproc_per_task – Number of processors to be used by each line of batch_command
job_time – Max wall time of each line of batch_command - see sbatch --time option (default 1 minute)
memory – Max memory usage of each cpu utilized by batch_command - see sbatch --mem-per-cpu option (default 2GB)
batch_type – Type of batch system (SLURM, PBS)
batch_option –
A string specifying any additional batch options in the file header;
It is inserted in raw form after the other batch options, so should include #SBATCH or #PBS
if it is a batch type option, and it could be a multiline string of options
(expected to contain the relevant #{SBATCH,PBS})
out_name – Name used for the output and error files
**kw – All other keywords are passed to OMFITx.executable
module_root – The module instance from which servers, etc. are culled
batch_lines – A multi-line string or list of strings that should be executed in parallel
environment – A string to be executed to set up the environment before launching the batch job
partition – A string to be inserted into the batch script that indicates
which partition(s) (comma separated) to run on; if None, execute batch_lines serially
partition_flag – A string to be inserted before the partition names which matches the
system configuration (e.g. -p, --qos)
nproc_per_task – Number of processors to be used by each line of batch_lines
job_time – Max wall time of each line of batch_lines - see sbatch --time option (default 1 minute)
memory – Max memory usage of each cpu utilized by batch_lines - see sbatch --mem-per-cpu option (default 2GB)
batch_type – Type of batch system (SLURM, PBS)
batch_option –
A string specifying any additional batch options in the file header;
It is inserted in raw form after the other batch options, so should include #SBATCH or #PBS
if it is a batch type option, and it could be a multiline string of options
(expected to contain the relevant #{SBATCH,PBS})
**kw – All other keywords are passed to OMFITx.executable
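A hedged usage sketch, assuming this functionality is exposed as OMFITx.job_array (the batch lines and SLURM settings are illustrative):

    OMFITx.job_array(root,
                     batch_lines=['my_code -case %d' % k for k in range(10)],
                     environment='module load my_code',
                     partition='short',
                     nproc_per_task=4,
                     job_time='0-00:30:00',
                     memory='2GB',
                     batch_type='SLURM')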
High level function that simplifies local/remote execution of Integrated Development Environments (IDEs)
This function will:
1. cd to the local working directory
2. Clear local/remote working directories
3. Deploy the "input" objects to the local working directory
4. Upload the files remotely
5. Execute the IDE
6. Download “output” files to local working directory
The IDE executable, server and tunnel, and local and remote working directory depend on the MainSettings[‘SERVER’][ide]
If the tunnel is an empty string, the connection to the remote server is direct. If server is an empty string, everything will occur locally and the remote working directory will be ignored.
Parameters:
module_root – root of the module (e.g. root) or OMFIT itself
ide – what IDE to execute (e.g. idl or matlab)
inputs – list of input objects or path to files, which will be deployed in the local or remote working directory.
To deploy objects with a different name one can specify tuples (inputObject,’deployName’)
outputs – list of output files which will be fetched from the remote directory
clean –
clear local/remote working directories
”local”: clean local working directory only
”local_force”: force clean local working directory only
”remote”: clean remote working directory only
”remote_force”: force clean remote working directory only
True: clean both
”force”: force clean both
False: clean neither
arguments – arguments which will be passed to the executable
interactive_input – interactive input to be passed to the executable
server – override module server
tunnel – override module tunnel
executable – override module executable
workdir – override module local working directory
remotedir – override module remote working directory
ignoreReturnCode – ignore return code of executable
std_out – if a list is passed (e.g. []), the stdout of the program will be put there line by line
std_err – if a list is passed (e.g. []), the stderr of the program will be put there line by line
script – string with script to be executed. The script option requires a %s in the command line at the location where you want the script filename to appear
High level function that simplifies archival of simulations files
Parameters:
module_root – root of the module (e.g. root)
server – override module server
tunnel – override module tunnel
storedir – directory where ZIP files are stored
if storedir is None, then this will be sought under module_root[‘SETTINGS’][‘REMOTE_SETUP’][‘storedir’]
and finally under SERVER[server][‘storedir’]
store_command – (optional) user-defined store command issued to store data (eg. for HPSS usage).
Strings {remotedir} {storedir} and {filename} are substituted with actual remotedir, storedir and filename
if store_command is None, then this will be sought under module_root[‘SETTINGS’][‘REMOTE_SETUP’][‘store_command’]
and finally under SERVER[server][‘store_command’]
restore_command – (optional) user-defined restore command issued to restore data (eg. for HPSS usage).
Strings {remotedir} {storedir} and {filename} are substituted with actual remotedir, storedir and filename
if restore_command is None, then this will be sought under module_root[‘SETTINGS’][‘REMOTE_SETUP’][‘restore_command’]
and finally under SERVER[server][‘restore_command’]
remotedir – remote directory to archive (usually: root[‘SETTINGS’][‘REMOTE_SETUP’][‘workdir’])
This parameter needs to be specified because the working directory can change.
filename – filename to be used for archival
quiet – print store process to screen
background – put creation of ZIP archive in background (ignored if store_command is used)
force – force store even if remotedir does not have OMFIT substring in it
remotedir – remote directory to deflate to (usually: root[‘SETTINGS’][‘REMOTE_SETUP’][‘workdir’])
This parameter needs to be specified because the working directory can change.
quiet – print restore process to screen
background – put restore of ZIP archive in background (ignored if restore_command is used)
force – force restore even if remotedir does not have OMFIT substring in it
Function that sanitizes a user-input tokamak name into a format that is recognized by other codes
Parameters:
tokamak – user string of the tokamak
output_style – format of the tokamak name used for the output; one of ['OMFIT', 'TRANSP', 'GPEC']
allow_not_recognized – allow a user to enter a tokamak which is not recognized
translation_dict – dictionary used for further translation. This is handy for example in
situations where we want to get the same string back independently of
whether it is a older tokamak or its upgraded version. For example
tokamak(‘NSTX-U’, translation_dict={‘NSTXU’:’NSTX’})
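A hedged usage sketch (the returned strings noted in comments are indicative only):

    tokamak('d3d')                                          # OMFIT-style name (e.g. 'DIII-D')
    tokamak('DIII-D', output_style='TRANSP')                # TRANSP-style name
    tokamak('NSTX-U', translation_dict={'NSTXU': 'NSTX'})   # map the upgrade back to 'NSTX'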
Returns a dictionary of information that is specific to a particular tokamak
Parameters:
device – The name of a tokamak. It will be evaluated with the tokamak() function so variation in spelling and
capitalization will be tolerated. This function has some additional translation capabilities for associating
MDSplus servers with tokamaks; for example, “EAST_US”, which is the entry used to access the eastdata.gat.com
MDSplus mirror, will be translated to EAST.
Returns:
A dictionary with as many static device measurements as are known
See: O. Sauter, et al., Phys. Plasmas 6, 2834 (1999); doi:10.1063/1.873240
Neoclassical conductivity appears in equations: 5, 7, 13a, and unnumbered equations in the conclusion
Other references:
S Koh, et al., Phys. Plasmas 19, 072505 (2012); doi: 10.1063/1.4736953
for dealing with ion charge number when there are multiple species
T Osborne, “efit.py Kinetic EFIT Method”, private communication (2013);
this is a word file with a description of equations used to form the current profile constraint
O Sauter, et al., Phys. Plasmas 9, 5140 (2002); doi:10.1063/1.1517052
this has corrections for Sauter 1999 but it also has a note on what Z to use in which equations; it argues that ion equations should use the
charge number of the main ions for Z instead of the ion effective charge number from Koh 2012
Accurate neoclassical resistivity, bootstrap current and other
transport coefficients (Fortran 90 subroutines and matlab functions): has some code that was used to check
the calculations in this script (BScoeff.m, nustar.m, sigmaneo.m, jdotB_BS.m)
Update August 2021: added a new set of analytical formulae for the computation of the neoclassical conductivity from
A. Redl, et al., Phys. Plasmas 28, 022502 (2021) https://doi.org/10.1063/5.0012664; all relevant variables are labelled as neo_2021
This function was initially written as part of the Kolemen Group Automatic Kinetic EFIT Project (auto_kEFIT).
Parameters:
psi_N – position basis for all profiles, required only for plotting (normalized poloidal magnetic flux)
Te – electron temperature in eV as a function of time and position (time should be first axis, then position)
ne – electron density in m^-3 (vs. time and psi)
Ti – ion temperature in eV
Zeff – [optional if nis and Zis are provided] effective charge state of ions
= sum_j(n_j (Z_j)^2)/sum_j(n_j Z_j) where j is ion species (this is probably a sum over deuterium and carbon)
nis – [optional if Zeff is provided] list of ion densities in m^-3
Zis – [optional if Zeff is provided] ion charge states (list of scalars)
Zdom – [might be optional] specify the charge number of the dominant ion species. Defaults to the one with the
highest total number of particles (volume integral of ni). If using the estimation method where only Zeff is
provided, then Zdom is assumed to be 1 if not provided.
q – safety factor
eps – inverse aspect ratio
R – major radius of the geometric center of each flux surface
fT – trapped particles fraction
volume – [not needed if Zdom is provided, unlikely to cause trouble if not provided even when “needed”] volume
enclosed by each flux surface, used to identify dominant ion species if dominant ion species is not defined
explicitly by doing a volume integral (so we need this so we can get dV/dpsiN). If volume is needed but not
provided, it will be crudely estimated. Errors in the volume integral are very unlikely to affect the selection
of the dominant ion species (it would have to be a close call to begin with), so it is not critical that volume
be provided with high quality, if at all.
return_info_pack – Boolean: If true, returns a dictionary full of many intermediate variables from this
calculation instead of just conductivity
plot_slice – Set to the index of the timeslice to plot in order to plot one timeslice of the calculation,
including input profiles and intermediate quantities. Set to None for no plot (default)
sigma_compare – provide a conductivity profile for comparison in Ohm^-1 m^-1
sigma_compare_label – plot label to use with sigma_compare
spitzer_compare – provide another conductivity profile for comparison (so you can compare neoclassical and
Spitzer) (Ohm^-1 m^-1)
spitzer_compare_label – plot label to use with spitzer_compare
charge_number_to_use_in_ion_collisionality –
instruction for replacing single ion species charge number Z in
nuistar equation when going to multi-ion species plasma.
Options are: [‘Koh’, ‘Dominant’, ‘Zeff’, ‘Zavg’, ‘Koh_avg’]
Dominant uses charge number of ion species which contributed the most electrons (recommended by Sauter 2002)
Koh uses expression from Koh 2012 page 072505-11 evaluated for dominant ion species (recommended by Koh 2012)
Koh_avg evaluates Koh for all ion species and then averages over species
Zeff uses Z_eff (No paper recommends using this but it appears to be used by ONETWO)
Zavg uses ne/sum(ni) (Koh 2012 recommends using this except for collision frequency)
Use Koh for best agreement with TRANSP
charge_number_to_use_in_ion_lnLambda –
instruction for replacing single ion species charge number Z in
lnLambda equation when going to multi-ion species plasma.
Options are: [‘Koh’, ‘Dominant’, ‘Zeff’, ‘Zavg’, ‘Koh_avg’]
Use Koh for best agreement with TRANSP
Returns:
neoclassical conductivity in (Ohm^-1 m^-1) as a function of time and input psi_N
(after interpolation/extrapolation).
If output with “return_info_pack”, the return is a dictionary containing several intermediate variables which
are used in the calculation (collisionality, lnLambda, etc.)
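A minimal call sketch (keyword names follow the parameter list above; all profile arrays are 2D with time as the first axis):
>> sigma_neo = nclass_conductivity(psi_N=psi_N, Te=Te, ne=ne, Ti=Ti, Zeff=Zeff, q=q, eps=eps, R=R, fT=fT,
>>                                 charge_number_to_use_in_ion_collisionality='Koh')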
WRAPPER FOR nclass_conductivity THAT EXTRACTS GFILE STUFF AND INTERPOLATES FOR YOU
Calculation of neoclassical conductivity
See: O. Sauter, et al., Phys. Plasmas 6, 2834 (1999); doi:10.1063/1.873240
Neoclassical conductivity appears in equations: 5, 7, 13a, and unnumbered equations in the conclusion
This function was initially written as part of the Kolemen Group Automatic Kinetic EFIT Project (auto_kEFIT).
Parameters:
psi_N – position basis for all non-gfile profiles
Te – electron temperature in eV as a function of time and position (time should be first axis, then position)
ne – electron density in m^-3 (vs. time and psi)
Ti – ion temperature in eV
Zeff – [optional if nis and Zis are provided] effective charge state of ions
= sum_j(n_j (Z_j)^2)/sum_j(n_j Z_j) where j is ion species (this is probably a sum over deuterium and carbon)
nis – [optional if Zeff is provided] list of ion densities in m^-3
Zis – [optional if Zeff is provided] ion charge states (list of scalars)
Zdom – [might be optional] specify the charge number of the dominant ion species. Defaults to the one with the
highest total number of particles (volume integral of ni). If using the estimation method where only Zeff is
provided, then Zdom is assumed to be 1 if not provided.
gEQDSK – an OMFITcollection of g-files or a single g-file as an instance of OMFITgeqdsk
return_info_pack – Boolean: If true, returns a dictionary full of many intermediate variables from this
calculation instead of just conductivity
plot_slice – Set to the index of the timeslice to plot in order to plot one timeslice of the calculation,
including input profiles and intermediate quantities. Set to None for no plot (default)
charge_number_to_use_in_ion_collisionality –
instruction for replacing single ion species charge number Z in
nuistar equation when going to multi-ion species plasma.
Options are: [‘Koh’, ‘Dominant’, ‘Zeff’, ‘Zavg’, ‘Koh_avg’]
Dominant uses charge number of ion species which contributed the most electrons (recommended by Sauter 2002)
Koh uses expression from Koh 2012 page 072505-11 evaluated for dominant ion species (recommended by Koh 2012)
Koh_avg evaluates Koh for all ion species and then averages over species
Zeff uses Z_eff (No paper recommends using this but it appears to be used by ONETWO)
Zavg uses ne/sum(ni) (Koh 2012 recommends using this except for collision frequency)
Use Koh for best agreement with TRANSP
charge_number_to_use_in_ion_lnLambda –
instruction for replacing single ion species charge number Z in
lnLambda equation when going to multi-ion species plasma.
Options are: [‘Koh’, ‘Dominant’, ‘Zeff’, ‘Zavg’, ‘Koh_avg’]
Use Koh for best agreement with TRANSP
Returns:
neoclassical conductivity in (Ohm^-1 m^-1) as a function of time and input psi_N (after
interpolation/extrapolation).
If output with “return_info_pack”, the return is a dictionary containing several intermediate variables which
are used in the calculation (collisionality, lnLambda, etc.)
See: O. Sauter, et al., Phys. Plasmas 6, 2834 (1999); doi:10.1063/1.873240
Other references:
S Koh, et al., Phys. Plasmas 19, 072505 (2012); doi: 10.1063/1.4736953
for dealing with ion charge number when there are multiple species
T Osborne, “efit.py Kinetic EFIT Method”, private communication (2013);
this is a Word file with a description of equations used to form the current profile constraint
O Sauter, et al., Phys. Plasmas 9, 5140 (2002); doi:10.1063/1.1517052
this has corrections for Sauter 1999 but it also has a note on what Z to use in which equations; it argues that ion equations should use the
charge number of the main ions for Z instead of the ion effective charge number from Koh 2012
Accurate neoclassical resistivity, bootstrap current and other
transport coefficients (Fortran 90 subroutines and matlab functions): has some code that was used to check
the calculations in this script (BScoeff.m, nustar.m, sigmaneo.m, jdotB_BS.m)
Y R Lin-Liu, et al., “Zoo of j’s”, DIII-D physics memo (1996);
got hardcopy from Sterling Smith & photocopied
Update August 2021: added a new set of analytical formulae for the computation of the neoclassical conductivity from
A. Redl, et al., Phys. Plasmas 28, 022502 (2021) https://doi.org/10.1063/5.0012664; all relevant variables are labeled neo_2021
This function was initially written as part of the Kolemen Group Automatic Kinetic EFIT Project (auto_kEFIT).
Parameters:
psi_N – normalized poloidal magnetic flux as a position coordinate for input profiles Te, Ti, ne, etc.
Te – electron temperature in eV, first dimension: time, second dimension: psi_N
Ti – ion temperature in eV, 2D with dimensions matching Te (time first)
ne – electron density in m^-3, dimensions matching Te
p – total pressure in Pa, dimensions matching Te
Zeff – [optional if nis and Zis are provided] effective charge state of ions
= sum_j(n_j (Z_j)^2)/sum_j(n_j Z_j) where j is ion species (this is probably a sum over deuterium and carbon)
nis – [optional if Zeff is provided] list of ion densities in m^-3
Zis – [optional if Zeff is provided] ion charge states (list of scalars)
R0 – [optional if device is provided and recognized] The geometric center of the tokamak’s vacuum vessel in m.
(For DIII-D, this is 1.6955 m (Osborne, Lin-Liu))
device – [used only if R0 is not provided] The name of a tokamak for the purpose of looking up R0
gEQDSKs –
a collection of g-files from which many parameters will be derived. The following quantities are
taken from g-files if ANY of the required ones are missing:
param psi_N_efit:
[optional] psi_N for the EFIT quantities if different from psi_N for kinetic profiles
param nt:
[optional] number of time slices in equilibrium data (if you don’t want to tell us, we will measure
the shape of the array)
param psiraw:
poloidal flux before normalization (psi_N is derived from this).
param R:
major radius coordinate R of each flux surface’s geometric center in m
param q:
safety factor (inverse rotation transform)
param eps:
inverse aspect ratio of each flux surface: a/R
param fT:
trapped particle fraction on each flux surface
param I_psi:
also known as F = R*Bt, averaged over each flux surface
version –
which quantity to return:
‘jB_fsa’ is the object directly from Sauter’s paper: 2nd term on RHS of last equation in conclusion.
‘osborne’ is jB_fsa w/ |I_psi| replaced by R0. Motivated by memo from T. Osborne about kinetic EFITs
‘jboot1’ is the 2nd term in the 1st equation of the conclusion of Sauter 1999 w/ correction from the Sauter 2002 erratum.
‘jboot1BROKEN’ is jboot1 without correction from Sauter 2002 (THIS IS FOR TESTING/COMPARISON ONLY)
‘neo_2021’ is a new set of analytical coefficients from A. Redl, et al. (the new formulae share the same analytical structure as ‘jboot1’ and ‘jboot1BROKEN’)
You should use jboot1 if you want <J.B>
You should use osborne if you want J *** Put this into current_to_efit_form() to make an EFIT
You should use jboot1 or jB_fsa to compare to Sauter’s paper, equations 1 and 2 of the conclusion
You should use jboot1BROKEN to compare to Sauter 1999 without the 2002 correction
debug_plots – plot internal quantities for debugging
return_units – If False: returns just the current profiles in one 2D array. If True: returns a 3 element tuple
containing the current profiles, a plain string containing the units, and a formatted string containing the
units
return_package – instead of just a current profile, return a dictionary containing the current profile as well
as other information
charge_number_to_use_in_ion_collisionality –
instruction for replacing single ion species charge number Z in
nuistar equation when going to multi-ion species plasma.
Options are: [‘Koh’, ‘Dominant’, ‘Zeff’, ‘Zavg’, ‘Koh_avg’]
Dominant uses charge number of ion species which contributed the most electrons (recommended by Sauter 2002)
Koh uses expression from Koh 2012 page 072505-11 evaluated for dominant ion species (recommended by Koh 2012)
Koh_avg evaluates Koh for all ion species and then averages over species
Zeff uses Z_eff (No paper recommends using this but it appears to be used by ONETWO)
Zavg uses ne/sum(ni) (Koh 2012 recommends using this except for collision frequency)
Use Koh for best agreement with TRANSP
Use Zavg for best agreement with recommendations by Koh 2012
charge_number_to_use_in_ion_lnLambda –
instruction for replacing single ion species charge number Z in
lnLambda equation when going to multi-ion species plasma.
Options are: [‘Koh’, ‘Dominant’, ‘Zeff’, ‘Zavg’, ‘Koh_avg’]
Use Koh for best agreement with TRANSP
Use Koh for best agreement with recommendations by Koh 2012
Return jB:
flux surface averaged j_bootstrap * B with some modifications according to which version you select
Return units:
[only if return_units==True] a string with units like “A/m^2”
Return units_format:
[only if return_units==True] a TeX formatted string with units like “$A/m^2$”
(can be included in plot labels directly)
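A minimal call sketch (the routine is referred to as sauter_bootstrap() elsewhere in this document; keyword names follow the parameter list above):
>> jB = sauter_bootstrap(psi_N=psi_N, Te=Te, Ti=Ti, ne=ne, p=p, Zeff=Zeff, gEQDSKs=gEQDSKs, version='osborne')
>> jB, units, units_format = sauter_bootstrap(psi_N=psi_N, Te=Te, Ti=Ti, ne=ne, p=p, Zeff=Zeff, gEQDSKs=gEQDSKs,
>>                                            version='jboot1', return_units=True)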
This is the first equation in the conclusion of Sauter 1999 (which is equation 5 with stuff plugged in),
with the correction from the erratum (Sauter 2002); it is written out schematically after the symbol list below.
In both equations, the first term is ohmic current and the second
term is bootstrap current. The second equation uses some
approximations which make the result much smoother. The first
equation had an error in the original Sauter 1999 paper that was
corrected in the 2002 erratum.
< > denotes a flux surface average (mentioned on page 2835)
j_par is the parallel current (parallel to the magnetic field B) (this is what we’re trying to find)
B is the total magnetic field
sigma_neo is the neoclassical conductivity given by equation 7 on page 2835 or equation 13 on page 2837
(this is mentioned as neoclassical resistivity on page 2836, but the form of the
equation clearly shows that it is conductivity, the reciprocal of resistivity.
Also the caption of figure 2 confirms that conductivity is what is meant.)
E_par is the parallel electric field
I(psi) = R * B_phi (page 2835)
p_e is the electron pressure
L_31, L_32, and L_34 are given by equations 8, 9, and 10 respectively (eqns on page 2835).
Also they are given again by eqns 14-16 on pages 2837-2838
p is the total pressure
d_psi() is the derivative with respect to psi (not psi_N)
T_e is the electron temperature
alpha is given by equation 11 on page 2835 or by eqn 17a on page 2838
T_i is the ion temperature
R_pe = p_e/p
f_T the trapped particle fraction appears in many equations and is given by equation 12 on page 2835
but also in equation 18b with nu_i* in equation 18c
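Putting the symbols above together, the equation being described reads, schematically (transcribed from the conclusion of Sauter 1999 with the 2002 correction, not copied from the code):
<j_par B> = sigma_neo <E_par B> - I(psi) p_e [ L_31 (p/p_e) d_psi(ln p) + L_32 d_psi(ln T_e) + L_34 alpha (1-R_pe)/R_pe d_psi(ln T_i) ]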
Estimate the profile of ohmic current using total current, the profile of bootstrap and driven current, and
neoclassical conductivity. The total Ohmic current profile is calculated by integrating bootstrap and driven current
and subtracting this from the total current. The Ohmic current profile is assigned assuming flat loop voltage and
the total is scaled to match the estimated total Ohmic current.
All inputs should be on the same coordinate system with the same dimensions, except itot, ibs, and idriven should
lack the position axis. If inputs have more than one dimension, position should be along the axis with index = 1
(the second dimension).
This function was initially written as part of the Kolemen Group Automatic Kinetic EFIT Project (auto_kEFIT).
Parameters:
cx_area – Cross sectional area enclosed by each flux surface as a function of psin in m^2
sigma – Neoclassical conductivity in Ohm^-1 m^-1
itot – Total plasma current in A
jbs – [optional if ibs is provided] Bootstrap current density profile in A/m^2.
If this comes from sauter_bootstrap(), the recommended version is ‘osborne’
ibs – [optional if jbs is provided] Total bootstrap current in A
jdriven – [optional if idriven is provided] Driven current density profile in A/m^2
idriven – [optional if jdriven is provided] Total driven current in A
Returns:
Ohmic current profile as a function of psin in A/m^2
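A minimal sketch of the logic described above (not the OMFIT implementation; single time slice, trapezoidal integration over cx_area assumed):
>> ibs = np.trapz(jbs, cx_area)                      # total bootstrap current from its density profile
>> idriven = np.trapz(jdriven, cx_area)              # total driven current
>> iohm = itot - ibs - idriven                       # estimated total Ohmic current
>> johm = sigma * iohm / np.trapz(sigma, cx_area)    # flat loop voltage: shape follows sigma, scaled to iohm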
Standardizes gas species names that could come from different/unknown sources.
These include common impurity gas molecules, so nitrogen and deuterium translate
to N2 and D2, since they form diatomic molecules.
For example, N2, N$_2$, N_2, and nitrogen all mean the same thing. This function
should accept any one of those and turn it into the format you request.
Intended to handle exotic capitalization; e.g. corrects NE or ne into Ne.
Parameters:
name – str
The name of the species
output_form –
str
simple: the shortest, simplest, unambiguous and correct form. E.g.: N2
latex: latex formatting for subscripts. E.g.: N$_2$
name: the name of the species. E.g.: nitrogen
markup: symbol with punctuation like _ to indicate underscores, but no $. E.g.: N_2
atom: just the symbol for the atom without molecular information. E.g.: N
This isn’t recommended as an output format; it’s here to simplify lookup in case
someone indicates nitrogen gas by N instead of N2. Doesn’t work for
mixed-element molecules.
on_fail –
str
Behavior on lookup failure.
raise: raise OMFITexception
print: print an error message and return None
quiet: quietly return None
Retrieves information on the gas species loaded into DIII-D gas injector(s)
Parameters:
shot – int
Shot to look up. The species loaded into each valve can change from shot to shot.
valve – str [optional]
Name of the gas valve or injector, like ‘A’ or ‘PFX1’.
‘GASA’ and ‘A’ are interchangeable. The valve name is not case sensitive.
If provided, returns a string with the gas species for this valve, or raises OMFITexception on failure.
If not provided, returns a dictionary of all the gas valves and species in the database.
name_format –
str [optional]
Reformat the name of the gas species to conform to a preferred standard. Options (with examples) are:
Set to None to get the string as written in the database.
Returns:
str or dict
Either the species of a specific valve, or a dictionary of all recorded valves and species.
Either the return value itself or the dictionary values will be strings that are padded with spaces.
If a valve is not in use, its gas species will be recorded as ‘None ‘.
Calculates EAST gas injection amounts based on tank pressure
Whereas DIII-D gas commands in V are often close to proportional to gas flows
(because they’re really inputs to an external flow controller built into the
valve assembly), EAST gas commands are much more raw. An EAST command of 5 V
is NOT roughly double a command of 2.5 V, for example. 2.5 V might not even
open the valve, and there’d be no way to be sure from just the command. There
is no flow sensor inside the EAST injectors like there is at DIII-D (the
source of the gasb_cal measurement instead of the gascgasb command). Lastly,
the EAST reservoirs behind the injectors are small and so the flow rate vs.
voltage is probably not constant. The “tank” here isn’t really a huge tank of
gas, it’s a length of pipe between the big tank of gas and the injector itself.
So, letting out enough gas to control a shot really can affect it significantly.
To get an actual measurement, we can turn to the pressure in the gas tank that
feeds the injector and watch its decrease over time to get a flow rate. This
script doesn’t calculate a flow rate, it provides the integral of the flow rate.
Since the data are noisy, some smoothing is recommended before differentiation
to find the flow rate; we leave the choice of smoothing strength to the end user.
Limitations:
1. COOLING: We assumed constant temperature so far, but the gas tank clearly
is cooled by letting gas out because the pressure very slowly rebounds after
the shot, presumably from the tank warming up again. Remember the “tank” is
actually just a little length of pipe behind the injector. To see how much
error this causes, just look at how much tank pressure rebounds after seeding
stops. The time history usually extends long after the shot. It seems like a
small enough error that it hasn’t been corrected yet.
2. NEEDS POST-PROCESSING: most users are probably interested in flow rates
and will have to take derivatives of the outputs of this function to get them,
including smoothing to defeat noise in the signals.
3. INCOMPLETE INFO: the electrons_per_molecule_puffed dictionary only lists a
few species so far.
Parameters:
shot – int
EAST shot number, like 85293
valve – str
Like OU1. Also accepts names like OUPEV1 and VOUPEV1; PEV and leading V will be removed.
There are different naming conventions in different contexts and this function tries to
parse them all.
species –
str
Species in the tank, like Ne.
Diluted species are accepted; for example, “50% Ne” will be split at % and give 0.5 of the molecules as Ne
and 0.5 of the molecules as the main ion species (probably D2).
main_ion_species – str
Species of main ions of the plasma, like D
server – str
MDSplus server to use as the source of the data. Intended to allow choice between ‘EAST’ and ‘EAST_US’.
tank_temperature – float
Temperature of the gas in the tank in K
verbose – bool
Print information like gas totals
plot_total – bool
Plot total seeding amount vs time
plot_flow – bool
Plot flow rate (derivative of total seeding) vs time
tsmo – float
Smoothing timescale in seconds, used in plots only. Very important for viewing flow rate.
axs – Axes instance or 1D array of Axes instances
Axes used with plot_total or plot_flow. If none are provided, a new figure will be created.
Number of Axes must be >= plot_flow+plot_total.
Returns:
tuple of 1D arrays and a str
1D array: time in s
1D array: impurity electrons added
1D array: fuel electrons added
1D array: total molecules added
str: primary species added
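The conversion behind these totals is essentially the ideal gas law applied to the drop in tank pressure; schematically (a sketch of the idea only, with tank_volume an assumed calibration constant that is not a documented argument):
>> k_B = 1.380649e-23  # J/K
>> molecules_released = (p_tank[0] - p_tank) * tank_volume / (k_B * tank_temperature)
>> impurity_electrons = molecules_released * impurity_fraction * electrons_per_impurity_molecule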
H98 confinement quality calculation valid for DIII-D
There are some other H98 calculations for DIII-D:
h_thh98y2 is calculated by the Transport code and is often regarded as
pretty accurate, but it will turn itself off at the slightest hint of
impurity seeding. If there’s noise in one of the valves (GASB had an issue
in 2021), this will be interpreted as seeding and the calculation will
disable itself. So, it was pretty much always disabled because of this in
years with noisy gas signals. The database that drives the 98y2 scaling goes
up to Zeff=3, so a tiny bit of impurity seeding is tolerable. Even if Zeff
is too high, it might still be interesting to see what H98 would be if it
could be trustworthy.
h98y2_aot is calculated from Automatic OneTwo runs and can overestimate H98.
The AOT calculation is usually available and doesn’t fail because of impurity
seeding, but it’s probably less accurate than this calculation.
Parameters:
shot – int
calc_tau – bool
Calculate tauth instead of gathering tauth. Might be useful if the
confinement code stops early and doesn’t write tauth.
calc_ptot – bool
Calculate total power instead of gathering it. Useful if ptot isn’t
written due to an error in another code
pinj_only – bool
Approximate ptot by pinj, ignoring ohmic and ECH power.
This might be okay.
estimated_signals – dict
Data for estimating missing signals. Should contain a ‘time’ key whose
value is a 1d float array. Should contain one additional key for each
pointname that is being estimated, whose value is a 1d float array with
length matching that of time.
data – dict [optional]
Pass in an empty dict and data used in the calculation will be written to
it. Intermediate quantities may be captured for inspection in this way.
Returns:
tuple containing two 1D float arrays
Time in ms
H98 as a function of time (unitless)
Attempts to form a reasonable list of likely EFITs based on guesses
Parameters:
scratch_area – dict
Scratch area for storing results to reduce repeat calls. Mainly included to match
call signature of available_efits_from_rdb(), since OMFITmdsValue already has caching.
device – str
Device name
shot – int
Shot number
default_snap_list – dict [optional]
Default set of EFIT treenames. Newly discovered ones will be added to the list.
**kw –
quietly accepts and ignores other keywords for compatibility with other similar functions
Returns:
(dict, str)
Dictionary keys will be descriptions of the EFITs
Dictionary values will be the formatted identifiers.
For now, the only supported format is just the treename.
If lookup fails, the dictionary will be {‘’: ‘’} or will only contain default results, if any.
String will contain information about the discovered EFITs
Attempts to look up a list of available EFITs using various sources
Parameters:
scratch_area – dict
Scratch area for storing results to reduce repeat calls.
device – str
Device name
shot – int
Shot number
allow_rdb – bool
Allow connection to DIII-D RDB to gather EFIT information (only applicable for select devices)
(First choice for supported devices)
allow_mds – bool
Allow connection to MDSplus to gather EFIT information (only applicable to select devices)
(First choice for non-RDB devices, second choice for devices that normally support RDB)
allow_guess – bool
Allow guesses based on common patterns of EFIT availability on specific devices
(Last resort, only if other options fail)
**kw –
Keywords passed to specific functions. Can include:
param default_snap_list:
dict [optional]
Default set of EFIT treenames. Newly discovered ones will be added to the list.
param format:
str
Instructions for formatting data to make the EFIT tag name.
Provided for compatibility with available_efits_from_rdb() because the only option is ‘{tree}’.
Returns:
(dict, str)
Dictionary keys will be descriptions of the EFITs
Dictionary values will be the formatted identifiers.
If lookup fails, the dictionary will be {‘’: ‘’} or will only contain default results, if any.
String will contain information about the discovered EFITs
This class wraps the line and PolyCollection(s) associated with a banded
errorbar plot for use in the uband function.
Parameters:
line – Line2D
A line of the x,y nominal values
bands – list of PolyCollections
The fill_between and/or fill_betweenx PolyCollections spanning the std_devs of the x,y data
omfit_classes.utils_plot.uband(x, y, ax=None, fill_kwargs=None, **kwargs)
Given arguments x,y where either or both have uncertainties, plot x,y using pyplot.plot
of the nominal values and surround it with a shaded error band using matplotlib’s
fill_between and/or fill_betweenx.
If y or x is more than 1D, it is flattened along every dimension but the last.
Parameters:
x – array of independent axis values
y – array of values with uncertainties, for which shaded error band is plotted
ax – The axes instance into which to plot (default: pyplot.gca())
fill_kwargs – dict. Passed to pyplot.fill_between
**kwargs – Passed to pyplot.plot
Returns:
list. A list of Uband objects containing the line and bands of each (x,y) along
the last dimension.
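For example, a minimal sketch:
>> x = np.linspace(0, 2*np.pi, 50)
>> y = unumpy.uarray(np.sin(x), 0.1 + 0*x)   # nominal values with a constant std_dev of 0.1
>> bands = uband(x, y, color='b', fill_kwargs={'alpha': 0.25})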
Utility function to conveniently return the color of an index in a colormap cycle
Parameters:
n – number of uniformly spaced colors, or array defining the colors’ spacings
k – index of the color (if None an array of colors of length n will be returned)
cmap_name – name of the colormap
Returns:
color of index k from colormap cmap_name made of n colors, or array of colors of length n if k is None
Note: if n is an array, then the associated ScalarMappable object is also returned (e.g. for use in a colorbar)
Given a matplotlib color specification or a line2D instance or a list with a line2D instance as the first element,
pick and return a color that will contrast well. More complicated than just inversion as inverting blue gives
yellow, which doesn’t display well on a white background.
Parameters:
line_or_color – matplotlib color spec, line2D instance, or list w/ line2D instance as the first element
Returns:
4 element array
RGBA color specification for a contrasting color
Given a matplotlib color specification or a line2D instance or a list with a line2D instance as the first element,
pick and return a color that will look thematically linked to the first color, but still distinguishable.
Parameters:
line_or_color – matplotlib color spec, line2D instance, or list w/ line2D instance as the first element
Returns:
4 element array
RGBA color specification for a related, similar (but distinguishable) color
omfit_classes.utils_plot.blur_image(im, n, ny=None)
Blurs the image by convolving with a Gaussian kernel of typical
size n. The optional keyword argument ny allows for a different
size in the y direction.
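For example (a minimal sketch):
>> smooth = blur_image(im, 5)           # blur with a kernel of typical size 5
>> smooth_xy = blur_image(im, 5, ny=2)  # use a different kernel size in the y direction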
Plots 2D data as a patch collection.
Differently from matplotlib.pyplot.pcolor the mesh is extended by one element so that the number of tiles equals
the number of data points in the Z matrix.
The X,Y grid does not have to be rectangular.
Parameters:
*args – Z or X,Y,Z data to be plotted
fast – bool
Use pcolorfast instead of pcolor. Speed improvements may be dramatic.
However, pcolorfast is marked as experimental and may produce unexpected behavior.
**kwargs – these arguments are passed to matplotlib.pyplot.pcolor
Returns the vertices of the mesh, if the xdata and ydata were the centers of the mesh.
xdata and ydata are 2D matrices, which could for example be generated by np.meshgrid.
Plot the various curves defined by the arguments [X],Y,[Z]
where X is the x value, Y is the y value, and Z is the color.
If one argument is given it is interpreted as Y; if two, then X, Y; if three
then X, Y, Z. If all three are given, then it is passed to plotc and the
labels are discarded.
If Z is omitted, a rainbow of colors is used, with blue for the first curve
and red for the last curve. A different color map can be given with the
cmap keyword (see http://wiki.scipy.org/Cookbook/Matplotlib/Show_colormaps
for other options). If X is omitted, then the (ith) index of Y is used as the x value.
This function takes 1 (Z) or 3 (X,Y,Z) 2D tables and interpolates them to higher resolution by bivariate spline interpolation.
If 1 (3) table(s) is(are) provided, then the second(fourth) argument is the resolution increase,
which can be a positive or negative integer: res=res0*2^n
or a float which sets the grid size in the units provided by the X and Y tables
class omfit_classes.utils_plot.infoScatter(x, y, annotes, axis=None, tol=5, func=None, all_on=False, suppress_canvas_draw=False, **kw)
XKCD plot generator by Jake Vanderplas; modified by Sterling Smith
This is a script that will take any matplotlib line diagram, and convert it
to an XKCD-style plot. It will work for plots with line & text elements,
including axes labels and titles (but not axes tick labels).
The idea for this comes from work by Damon McDougall
This adjusts all lines, text, legends, and axes in the figure to look
like xkcd plots. Other plot elements are not modified.
Parameters:
ax – Axes instance
the axes to be modified.
mag – float
the magnitude of the distortion
f1, f2, f3 – int, float, int
filtering parameters. f1 gives the size of the window, f2 gives
the high-frequency cutoff, f3 gives the size of the filter
xaxis_loc, yaxis_loc – float
The locations to draw the x and y axes. If not specified, they
will be drawn from the bottom left of the plot
xaxis_arrow – str
where to draw arrows on the x axes. Options are ‘+’, ‘-’, ‘+-’, or ‘’
yaxis_arrow – str
where to draw arrows on the y axes. Options are ‘+’, ‘-’, ‘+-’, or ‘’
ax_extend – float
How far (fractionally) to extend the drawn axes beyond the original
axes limits
expand_axes – bool
if True, then expand axes to fill the figure (useful if there is only
a single axes in the figure)
ylabel_rot – float
number of degrees to rotate the y axis label
I don’t think this function considers shaded bands such as would be used to display error bars. Increasing the
margin may be a good idea when dealing with such plots.
Parameters:
ax – a matplotlib axes object
margin – The fraction of the total height of the y-data to pad the upper and lower ylims
bool
True if the x xor y axis satisfies all of the following and thus looks like it’s probably a colorbar:
No ticks, no tick labels, no axis label, and range is (0, 1)
fig – Specify a figure instance instead of letting the function pick the most recent one
axes – Specify a plot axes instance or list/array of plot axes instances instead of letting the function use
fig.get_axes()
corner – Which corner does the tag go in? [0, 0] for bottom left, [1, 0] for bottom right, etc.
font_size – Font size of the annotation.
skip_suspected_colorbars – bool
Try to detect axes which are home to colorbars and skip tagging them. An Axes instance is suspected of having a
colorbar if either the xaxis or yaxis satisfies all of these conditions:
- Length of tick list is 0
- Length of tick label list is 0
- Length of axis label is 0
- Axis range is (0,1)
start_at – int
Offset value for skipping some numbers. Useful if you aren’t doing real subfigs, but two separate plots and
placing them next to each other in a publication. Set to 1 to start at (b) instead of (a), for example.
annotate_kw – dict
Additional keywords passed to annotate(). Keywords used by settings such as corner, etc. will be overridden.
Plot 2D or 3D data as line-plots with interactive navigation through the
alternate dimensions. Navigation uses the 4 arrow keys to traverse up to 2 alternate
dimensions.
The data must be on a regular grid, and is formed into a xarray DataArray if not already.
Uses matplotlib line plot for float/int data, OMFIT uerrorbar for uncertainty variables.
Examples:
The view1d can be used to interactively explore data. For usual arrays it draws line slices.
>> t = np.arange(20)
>> s = np.linspace(0,2*np.pi,60)
>> y = np.sin(np.atleast_2d(s).T+np.atleast_2d(t))
>> da = xarray.DataArray(y,coords=SortedDict([(‘space’,s),(‘time’,t)]),name=’sine’)
>> v = View1d(da.transpose(‘time’,’space’),dim=’space’,time=10)
For uncertainties arrays, it draws errorbars using the uerrorbar function. Multiple views
with the same dimensions can be linked for increased speed (eliminate redundant calls to redraw).
>> y_u = unumpy.uarray(y+(random(y.shape)-0.5),random(y.shape))
>> da_u = xarray.DataArray(y_u,coords=SortedDict([(‘space’,s),(‘time’,t)]),name=’measured’)
>> v_u = View1d(da_u,dim=’space’,time=10,axes=pyplot.gca())
>> v.link(v_u) # v will remain connected to keypress events and drive vu
Variable dependent axis data can be viewed if x and y share a regular grid in some coordinates,
>> x = np.array([s+(random(s.shape)-0.5)*0.2 for i in t]).T
>> da_x = xarray.DataArray(x,coords=SortedDict([(‘space’,s),(‘time’,t)]),name=’varspace’)
>> ds = da_u.to_dataset().merge(da_x.to_dataset())
>> v_x = View1d(ds,name=’measured’,dim=’varspace’,time=10,axes=pyplot.gca())
>> v.link(v_x)
Parameters:
data – DataArray or array-like
2D or 3D data values to be viewed.
coords – dict-like
Dictionary of Coordinate objects that label values along each dimension.
dims – tuple
Dimension names associated with this array.
name – string
Label used in legend. Empty or beginning with ‘_’ produces no legend label.
If the data is a DataArray it will be renamed before plotting.
If the data is a Dataset, the name specifies which of its existing data_vars to plot.
dim – string, DataArray
Dimension plotted on x-axis. If DataArray, must have same dims as data.
axes – Axes instance
The axes plotting is done in.
dynamic_ylim – bool
Re-scale y limits of axes when new slices are plotted.
use_uband – bool
Use uband instead of uerrorbar to plot uncertainties variables.
cornernote_options – dict
Key word arguments passed to cornernote (such as root, shot, device). If this is present, then cornernote
will be updated with the new time if there is only one time to show, or the time will be erased from the
cornernote if more than one time is shown by this View1d instance (such as by freezing one slice).
plot_options – dict
Key word arguments passed to plot/uerrorbar/uband.
**indexers – dict
Dictionary with keys given by dimension names and values given by
arrays of coordinate index values. Must include all dimensions other
than the fundamental.
Plot 2D data with interactive slice viewers attached to the 2D Axes.
Left clicking on the 2D plot refreshes the line plot slices, right clicking
overplots new slices.
The original design of this viewer was for data on a rectangular grid, for which
x and y are 1D arrays defining the axes but may be irregularly spaced. In this case,
the line plot points correspond to the given data. If x or y is a 2D array,
the data is assumed irregular and interpolated to a regular grid using
scipy.interpolate.griddata.
Example:
Explore a basic 2D np array without labels,
>> x = np.linspace(-1, 1, 200)
>> y = np.linspace(-2, 2, 200)
>> xx, yy = meshgrid(x, y)
>> z = np.exp(-xx**2 - yy**2)
>> v = View2d(z)
To add more meaningful labels to the axes and data do,
>> v = View2d(z, coords={‘x’:x, ‘y’:y}, dims=(‘x’, ‘y’), name=’wow’)
or use a DataArray,
>> d = DataArray(z, coords={‘x’:x, ‘y’:y}, dims=(‘x’, ‘y’), name=’wow’)
>> v = View2d(d)
Note that the coordinates should be 1D. Initializing a view with regular grid,
2D coordinates will result in an attempt to slice them appropriately.
This is done for consistency with some matplotlib 2D plotting routines,
but is not recommended.
>> v = View2d(z, coords=dict(x=x, y=y), dims=(‘x’, ‘y’))
If you have irregularly distributed 2D data, it is recommended that you first interpolate
it to a 2D grid in whatever way is most applicable. If you do not, initializing a view
will result in an attempt to linearly interpolate to an automatically chosen grid.
>> x = np.random.rand(1000)
>> y = np.random.rand(1000) * 2
>> z = np.exp(-x**2 - y**2)
>> v = View2d(z, coords=dict(x=x, y=y), dims=(‘x’, ‘y’))
The same applies for 2D collections of irregular points and values.
>> x = x.reshape((50, 20))
>> y = y.reshape((50, 20))
>> z = z.reshape((50, 20))
>> v = View2d(z, coords=[(‘x’, x), (‘y’, y)], dims=(‘x’, ‘y’))
Parameters:
data – DataArray or array-like
2D or 3D data values to be viewed.
coords – dict-like
Dictionary of Coordinate objects that label values along each dimension.
dims – tuple
Dimension names associated with this array.
name – string
Label used in legend. Empty or beginning with ‘_’ produces no legend label.
dim – string, DataArray
Dimension plotted on x-axis. If DataArray, must have same dims as data.
axes – Axes instance
The axes plotting is done in.
quiet – bool
Suppress printed messages.
use_uband – bool
Use uband for 1D slice plots instead of uerrorbar.
contour_levels – int or np.ndarray
Number of or specific levels used to draw black contour lines over the 2D image.
imag_options – dict
Key word arguments passed to the DataArray plot method (modified pcolormesh).
plot_options – dict
Key word arguments passed to plot or uerrorbar. Color will be determined by cmap variable.
**indexers – dict
Dictionary with keys given by dimension names and values given by
arrays of coordinate index values.
if None append shot/time as from OMFIT[‘MainSettings’][‘EXPERIMENT’]
if OMFITmodule append shot/time as from root[‘SETTINGS’][‘EXPERIMENT’]
device – override device string (does not print device at all if empty string)
shot – override shot string (does not print shot at all if empty string)
time – override time string (does not print time at all if empty string)
ax – axis to plot on
fontsize – str or float. Sets font size of the Axes annotate method.
clean – delete existing cornernote(s) from current axes before drawing a new cornernote
remove – delete existing cornernote(s) and return before drawing any new ones
remove_specific – delete existing cornernote(s) from current axes only if text matches the text that would be printed by the current call to cornernote() (such as identical shot, time, etc.)
Create a .Line2D instance with x and y data in sequences of
xdata, ydata.
Additional keyword arguments are .Line2D properties:
Properties:
agg_filter: a filter function, which takes a (m, n, 3) float array and a dpi value, and returns a (m, n, 3) array and two offsets from the bottom left corner of the image
alpha: scalar or None
animated: bool
antialiased or aa: bool
clip_box: .Bbox
clip_on: bool
clip_path: Patch or (Path, Transform) or None
color or c: color
dash_capstyle: .CapStyle or {‘butt’, ‘projecting’, ‘round’}
dash_joinstyle: .JoinStyle or {‘miter’, ‘round’, ‘bevel’}
dashes: sequence of floats (on/off ink in points) or (None, None)
data: (2, N) array or two 1D arrays
drawstyle or ds: {‘default’, ‘steps’, ‘steps-pre’, ‘steps-mid’, ‘steps-post’}, default: ‘default’
figure: .Figure
fillstyle: {‘full’, ‘left’, ‘right’, ‘bottom’, ‘top’, ‘none’}
gid: str
in_layout: bool
label: object
linestyle or ls: {‘-’, ‘–’, ‘-.’, ‘:’, ‘’, (offset, on-off-seq), …}
linewidth or lw: float
marker: marker style string, ~.path.Path or ~.markers.MarkerStyle
markeredgecolor or mec: color
markeredgewidth or mew: float
markerfacecolor or mfc: color
markerfacecoloralt or mfcalt: color
markersize or ms: float
markevery: None or int or (int, int) or slice or list[int] or float or (float, float) or list[bool]
path_effects: .AbstractPathEffect
picker: float or callable[[Artist, Event], tuple[bool, dict]]
pickradius: float
rasterized: bool
sketch_params: (scale: float, length: float, randomness: float)
snap: bool or None
solid_capstyle: .CapStyle or {‘butt’, ‘projecting’, ‘round’}
solid_joinstyle: .JoinStyle or {‘miter’, ‘round’, ‘bevel’}
transform: unknown
url: str
visible: bool
xdata: 1D array
ydata: 1D array
zorder: float
See set_linestyle() for a description of the line styles,
set_marker() for a description of the markers, and
set_drawstyle() for a description of the draw styles.
Creates a set of subplots in an approximate square, with a few empty subplots if needed
Parameters:
nplots – int
Number of subplots desired
ncol_max – int
Maximum number of columns to allow
flip – bool
True: Puts row 0 at the bottom, so every plot on the bottom row can accept an X axis label
False: Normal plot numbering with row 0 at the top. The bottom row may be sparsely populated.
sparse_column –
bool
Controls the arrangement of empty subplots.
True: the last column is sparse. That is, all the empty plots will be in the last column. There will be at most
one plot missing from the last row, and potentially several from the last column. The advantage is this
provides plenty of X axes on the bottom row to accept labels. To get natural numbering of flattened
subplots, transpose before flattening: axs.T.flatten(), or just use the 1D axsf array that’s returned.
False: the last row is sparse. All the empty plots will be in the last row. The last column will be missing at
most one plot, but the last row may be missing several. This arrangement goes more smoothly with the
numbering of axes after flattening.
just_numbers – bool
Don’t create any axes, but instead just return the number of rows, columns, and empty subplots in the array.
identify – bool
For debugging: write the number (as flattened) and [row, col] coordinates of each subplot on the plot itself.
These go in the center, in black. In the top left corner in red is the naive flattened count, which will appear
on empty plots as well to show how wrong it is. In the bottom right corner in blue is the proper flattened count
based on axsf.
fig – Figure instance [optional]
**kw – keywords passed to pyplot.subplots when creating axes (like sharex, etc.)
Returns:
(axs, axsf) or (nr, nc, on, empty)
axs: 2d array of Axes instances. It is flipped vertically relative to normal axes output by pyplot.subplots,
so the 0th row is the bottom. This is so the bottom row will be populated and can receive x axis labels.
axsf: 1d array of Axes instances, leaving out the empty ones (they might not be in order nicely)
empty: int: number of empty cells in axs.
The first empty, if there is one, is [-1, -1] (top right), then [-1, -2] (top row, 2nd from the right), etc.
nr: int: number of rows
nc: int: number of columns
on: 2d bool array: flags indicating which axes should be on (True) and which should be hidden/off (False)
This function returns the optimal location of the inner knots for an nth degree spline interpolation of y=f(x)
Parameters:
x – input x array
y – input y array
x0 – initial knots distribution (list) or number of knots (integer)
s – order of the spline
w – input weights array
allKnots – returns all knots or only the central ones excluding the extremes
userFunc – autoknot with user defined function with signature y0=userFunc(x,y)(x0)
minDist – a number between >0 and infinity (though usually <1), which sets the minimum distance between knots.
If small, knots will be allowed to be close to one another; if large, knots will be equispaced.
Use None to automatically determine this parameter based on: 0.01*len(knots)
If minDist is a string, then it will be evaluated (the knot locations in the string can be accessed as knots).
minKnotSpacing – a number in x input units that denotes the minimal inter-knot space that autoknot should
aim for. It shows up as an additional term in the cost function and is heavily penalized. If it is too large
it will force autoknot to output evenly spaced knots. Defaults to 0 (i.e. no limit).
Returns:
x0 optimal location of inner knots to spline interpolate y=f(x)
f1=interpolate.LSQUnivariateSpline(x,y,x0,k=s,w=w)
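A minimal usage sketch (assuming the routine is exposed as autoknot, the name used in the description above):
>> x0 = autoknot(x, y, 5, s=3, w=w)                           # find 5 optimal inner knots for a cubic spline
>> f1 = interpolate.LSQUnivariateSpline(x, y, x0, k=3, w=w)   # build the spline from the returned knots, as shown above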
class omfit_classes.utils_fit.knotted_fit_base(x, y, yerr)
Bases: object
The base class for the types of fits that have free knots and locations
Does basic checking for x,y,yerr then stores them in
self.x, self.y, self.yerr
such that x is monotonically increasing
Along the way of obtaining the fit with the desired parameters, other
intermediate fits may be obtained. These are stored in the fits
attribute (a dict), the value of whose keys provide an indication of how
the fit was obtained, relative to the starting fit. For instance, to provide
a variable knot fit, a fixed knot (equally spaced) fit is performed first.
Also an initial fit is necessary to know if there are any outliers, and then
the outliers can be detected. The get_best_fit method is useful for
determining which of all of the fits is the best, meaning the valid fit with
the lowest reduced chi^2. Here valid means:
1. the knots are in order
2. the knots are at least min_dist apart
3. the errorbars on the fit parameters were able to be determined
4. the errorbars of the knot locations are smaller than the distance between the knots
Note that 1) and 2) should be satisfied by using lmfit Parameter constraints,
but it doesn’t hurt to double check :-)
Developer note: If the fitter is always failing to find the errorbars due to
tolerance problems, there are some tolerance keywords that can be passed to
lmfit.minimize: xtol, ftol, gtol
that could be exposed.
Initialize the fitSL object, including calculating the first fit(s)
Parameters:
x – The x values of the data
y – The values of the data
yerr – The errors of the data
Fit Keywords:
Parameters:
knots –
Positive integer: Use this number of knots as default (>=3)
Negative integer: Invoke the fit_knot_range method for the range (3,abs(knots))
list-like: Use this list as the starting point for the knot locations
min_dist – The minimum distance between knot locations
* min_dist > 0 (faster) enforced by construction
* min_dist < 0 (much slower) enforced with lmfit
first_knot – The first knot can be constrained to be above first_knot.
The default is above min(x)+min_dist (The zeroth knot is at 0.)
fixed_knots – If True, do not allow the knot locations to change
fit_SOL – If True, include data points with x>1
monotonic – If True, only allow positive scale lengths
min_slope – Constrain the scale lengths to be above min_slope
outliers – Do an initial fit, then throw out any points a factor
of outliers standard deviations away from the fit
Convenience Keywords:
Parameters:
plot_best – Plot the best fit
allow_no_errorbar – If True, get_best_fit will return the
best fit without errorbars if no valid fit with errorbars exists
lmfit_out – lmfit.MinimizerResult instance to use for getting uncertainties in the curve
omfit_classes.utils_fit.xy_outliers(x, y, cutoff=1.2, return_valid=False)
This function returns the index of the outlier x,y data.
It is useful to run before doing a fit of experimental data
to remove outliers. This function works assuming that
the first and the last samples of the x/y data set are
valid data points (i.e. not outliers).
Parameters:
x – x data (e.g. rho)
y – y data (e.g. ne)
cutoff – sensitivity of the cutoff (smaller numbers -> more sensitive [min=1])
return_valid – if False returns the index of the outliers, if True returns the index of the valid data
Returns:
index of outliers or valid data depending on return_valid switch
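For example (a minimal sketch, assuming rho and ne are numpy arrays):
>> bad = xy_outliers(rho, ne, cutoff=1.2)            # indices of suspected outliers
>> good = xy_outliers(rho, ne, return_valid=True)    # indices of the valid points instead
>> rho_clean, ne_clean = rho[good], ne[good]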
Uncertainty in the dependent variable data points.
noise_ub: float, optional
Upper bound on a multiplicative factor that will be optimized to infer the most probable systematic
underestimation of uncertainties. Note that this factor is applied over the entire data profile,
although diagnostic uncertainties are expected to be heteroscedastic. Default is 2 (giving
significant freedom to the optimizer).
random_starts: int, optional
Number of random starts for the optimization of the hyperparameters of the GP. Each random
starts begins sampling the posterior distribution in a different way. The optimization that
gives the largest posterior probability is chosen. It is recommended to increase this value
if the fit proves difficult. If the regression fails, it might be necessary to vary the
constraints given in the _fit method of the class GPfit2 below, which has been kept rather
general for common usage. Default is 20.
zero_value_outside: bool, optional
Set to True if the profile to be evaluated is expected to go to zero beyond the LCFS, e.g. for
electron temperature and density; note that this option does NOT force the value to be 0 at the LCFS,
but only attempts to constrain the fit to stabilize to 0 well beyond rho=1. Profiles like those of
omega_tor_12C6 and T_12C6 are experimentally observed not to go to 0 at the LCFS, so this option
should be set to False for these. Default is True.
ntanh: integer, optional
Set to 2 if an internal transport barrier is expected. Default is 1 (no ITB expected).
This parameter has NOT been tested recently.
verbose: bool, optional
If set to True, outputs messages from non-linear kernel optimization. Default is False.
Function to conveniently plot the input data and the result of the fit.
Parameters:
profile: int, optional
Profile to evaluate if more than one has been computed and included in the gp
object. To call the nth profile, set profile=n. If None, it will return an
array of arrays.
Plotting of the raw data and fit with uncertainties
Returns:
None
class omfit_classes.utils_fit.fitLG(x, y, e, d, ng=100, sm=1, nmax=None)
Bases: object
This class provides linear fitting of experimental profiles, with Gaussian blurring for smoothing.
This procedure was inspired by discussions with David Eldon about the Weighted Average of Interpolations to a Common base (WAIC) technique that he describes in his thesis.
However the implementation here is quite a bit different, in that instead of using a weighted average the median profile is taken, which allows for robust rejection of outliers.
In this implementation the profile smoothing is obtained by radially perturbing the measurements based on the farthest distance to their neighboring point.
Modified hyperbolic tangent function for fitting the pedestal, with a Gaussian function for the fitting of the core.
Stefanikova, E., et al., Rev. Sci. Instrum., 87 (11), Nov 2016
This function is designed to fit H-mode density and temperature profiles as a function of psi_n.
Python based replacement for GAprofiles IDL spline routine “spl_mod”
* Code accepts irregularly spaced (x,y,e) data and returns fit on regularly spaced grid
* Numerical spline procedure based on Numerical Recipes Sec. 3.3 equations 3.3.2, 3.3.4, 3.3.5, 3.3.7
* Auto-knotting uses LMFIT minimization with chosen scheme
* Boundary conditions enforced with matrix elements
The logic of this implementation is as follows:
* The default is to auto-knot, which uses least-squares minimization to choose the knot locations.
* If auto-knotting, then there are options to guess the knot locations or to bias the knots. Else manual knots are used and LMFIT is not called.
* If the knot guess is present, then it is used; otherwise there are two options (below).
* If the knot bias is None or >-1 then the knot guess is uniformly distributed using linspace. Else we use linspace with a knot bias.
For the edge data, the logic is as follows:
* We can have auto/manual knots, free/fixed boundary value, and fit/ignore edge data.
* When we fit edge data, then that edge data places a constraint on boundary value.
When monte-carlo is used, the return value is an unumpy uncertainties array that contains the mean and standard-deviation of the monte-carlo trials.
Design Matrix for cubic spline interpolation
Numerical Recipes Sec. 3.3
:param xcore: rho values for data on [0,1]
:param knotloc: knot locations on [0,1]
:param bcs: Dictionary of boundary conditions
:return: design matrix for cubic interpolating spline
Get the spline y-values at the knot locations that best fit the data
:param xdata: x values of measured data
:param ydata: values of measured data
:param wgt: weight of measured data
:param knotloc: location of spline knots [0, …, 1]
:param bcs: dictionary of boundary conditions
:param d, geeinvc, fx, b: Return values from design_matrix
:return: values of the cubic interpolating spline at knot locations that best match the data
Generalized fitter derived from B. Grierson tools
fits core with pow_core polynomial C(x), and
edge with offset exponential of form
E(x) = offset + A*np.exp(-(x - xsym)/edge_width)
blends functions together about x=x_sym with tanh-like behavior
y_fit = (C(x)*np.exp(z) + E(x)*np.exp(-z))/(np.exp(z) + np.exp(-z))
where z = (xsym - x)/blend_width
Parameters:
method – minimization method to use
verbose – turns on details of set flags
onAxis_gradzero – turn on to force C’(0) = 0 (effectively y’(0) for xsym/blend_width >> 1)
onAxis_value – set to force y(0) = onAxis_value
fitEdge – set = False to require E(x) = offset, A=edge_width=0
edge_value – set to force y(x=1) = edge_value
maxiter – controls maximum # of iterations
blend_width_min – minimum value for the core edge blending
edge_width_min – minimum value for the edge
sym_guess – guess for the x location of the pedestal symmetry point
sym_min – constraint for minimum x location for symmetry point
sym_max – constraint for maximum x location for symmetry point
Pos_edge_exp:
force exponential to be positively valued so that exponential will have a negative slope in the SOL
Methods:
__call__(x):
Evaluate mtanh_polyexp at x, propagating correlated uncertainties in the fit parameters.
class omfit_classes.utils_fit.UncertainRBF(x, d, e, centers=None, function='multiquadric', epsilon=None, norm=None)
Bases: object
A class for radial basis function fitting of n-dimensional uncertain scattered data
Parameters:
*args – arrays x, y, z, …, d, e where
x, y, z, … are the coordinates of the nodes
d is the array of values at the nodes, and
e is the standard deviation error of the values at the nodes
centers – None the RBFs are centered on the input data points (can be very expensive for large number of nodes points)
-N: N nodes randomly distributed in the domain
N: N*N nodes uniformly distributed in the domain
np.array(N,X): user-defined array with X coordinates of the N nodes
epsilon – float Adjustable constant for Gaussian - defaults to approximate average distance between nodes
function – ‘multiquadric’: np.sqrt((r / self.epsilon) ** 2 + 1) #<— default
‘inverse’: 1.0 / np.sqrt((r / self.epsilon) ** 2 + 1)
‘gaussian’: np.exp(-(r**2 / self.epsilon))
‘linear’: r
‘cubic’: r ** 3
‘quintic’: r ** 5
‘thin_plate’: r ** 2 * np.log(r)
norm – default “distance” is the euclidean norm (2-norm)
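A minimal usage sketch for 1D data (the evaluation interface, calling the instance on new coordinates, is assumed rather than documented here):
>> rbf = UncertainRBF(x, d, e, centers=None, function='multiquadric')
>> d_new = rbf(x_new)   # assumed call: evaluate the fit (with uncertainties) at the new points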
Given a dictionary with some guesses (can be incomplete or even empty), fills in any missing values with
defaults, makes consistent deltas, and then defines a parameter set.
Produces a parameter instance suitable for input to lmfit .fit() method and stores it as self.guess.
Get the hash for an array.
A hash is a fixed-size integer that identifies a particular value,
comparing this integer is faster than comparing the whole array
(if you already have it stored).
Example:
>> y1 = np.arange(3)
>> y2 = np.arange(3) + 1
>> assert get_array_hash(y1) == get_array_hash(y2)  # will raise an error
Mimics the Matlab ismember() function to look for occurrences of A into B
Parameters:
A – number or list/array
B – number or list/array
Returns:
returns lia, locb lists.
lia: returns ‘True’ where the data in A is found in B and ‘False’ elsewhere.
locb: contains the lowest index in B for each value in A that is a member of B,
and ‘None’ elsewhere (where A is not a member of B)
This function returns the indices that one must use
to reproduce step-wise data with the minimum number
of points. The original data can then be reproduced
by nearest-neighbor interpolation of the values at the
returned indices.
x0 – pack points around x0, a float between -1 and 1
p – packing proportional to p factor >0
Returns:
packed points distribution between -1 and 1
omfit_classes.utils_math.simplify_polygon(x, y, tolerance=None, preserve_topology=True)[source]¶
Returns a simplified representation of a polygon
Parameters:
x – array of x coordinates
y – array of y coordinates
tolerance – all points in the simplified object will be within the tolerance distance of the original geometry
if tolerance is None, then a tolerance guess is returned
preserve_topology – by default a slower algorithm is used that preserves topology
Returns:
x and y coordinates of simplified polygon geometry
if tolerance is None, then a tolerance guess is returned
Given a SORTED iterable (a numeric array or list of numbers) and a numeric scalar my_number, find the index of the
number in the list that is closest to my_number
Parameters:
my_list – Sorted iterable (list or array) to search for number closest to my_number
my_number – Number to get close to in my_list
Returns:
Index of my_list element closest to my_number
Note:
If two numbers are equally close, returns the index of the smallest number.
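A minimal sketch of the described behaviour using np.searchsorted (the actual implementation may differ):

import numpy as np

def closest_index(my_list, my_number):
    """Index of the element of a SORTED array closest to my_number (ties -> smaller element)."""
    a = np.asarray(my_list)
    i = np.searchsorted(a, my_number)          # insertion point that keeps a sorted
    if i == 0:
        return 0
    if i == len(a):
        return len(a) - 1
    # compare the two neighbours; '<=' keeps the smaller element on a tie
    return i - 1 if my_number - a[i - 1] <= a[i] - my_number else i

assert closest_index([0.0, 1.0, 2.0], 1.4) == 1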
class omfit_classes.utils_math.interp1e(x, y, *args, **kw)[source]¶
Bases: interp1d
Shortcut for scipy.interpolate.interp1d with fill_value=’extrapolate’ and bounds_error=False as defaults
Interpolate a 1-D function.
x and y are arrays of values used to approximate some function f:
y=f(x). This class returns a function whose call method uses
interpolation to find the value of new points.
Parameters:
x : (N,) array_like
A 1-D array of real values.
y : (…,N,…) array_like
A N-D array of real values. The length of y along the interpolation
axis must be equal to the length of x.
kind : str or int, optional
Specifies the kind of interpolation as a string or as an integer
specifying the order of the spline interpolator to use.
The string has to be one of ‘linear’, ‘nearest’, ‘nearest-up’, ‘zero’,
‘slinear’, ‘quadratic’, ‘cubic’, ‘previous’, or ‘next’. ‘zero’,
‘slinear’, ‘quadratic’ and ‘cubic’ refer to a spline interpolation of
zeroth, first, second or third order; ‘previous’ and ‘next’ simply
return the previous or next value of the point; ‘nearest-up’ and
‘nearest’ differ when interpolating half-integers (e.g. 0.5, 1.5)
in that ‘nearest-up’ rounds up and ‘nearest’ rounds down. Default
is ‘linear’.
axis : int, optional
Specifies the axis of y along which to interpolate.
Interpolation defaults to the last axis of y.
copy : bool, optional
If True, the class makes internal copies of x and y.
If False, references to x and y are used. The default is to copy.
bounds_error : bool, optional
If True, a ValueError is raised any time interpolation is attempted on
a value outside of the range of x (where extrapolation is
necessary). If False, out of bounds values are assigned fill_value.
By default, an error is raised unless fill_value="extrapolate".
fill_value : array-like or (array-like, array_like) or “extrapolate”, optional
if a ndarray (or float), this value will be used to fill in for
requested points outside of the data range. If not provided, then
the default is NaN. The array-like must broadcast properly to the
dimensions of the non-interpolation axes.
If a two-element tuple, then the first element is used as a
fill value for x_new<x[0] and the second element is used for
x_new>x[-1]. Anything that is not a 2-element tuple (e.g.,
list or ndarray, regardless of shape) is taken to be a single
array-like argument meant to be used for both bounds as
below, above = fill_value, fill_value.
New in version 0.17.0.
If “extrapolate”, then points outside the data range will be
extrapolated.
New in version 0.17.0.
assume_sorted : bool, optional
If False, values of x can be in any order and they are sorted first.
If True, x has to be an array of monotonically increasing values.
Attributes:
fill_value
Methods:
__call__
See Also:
splrep, splev
Spline interpolation/smoothing based on FITPACK.
UnivariateSpline : An object-oriented wrapper of the FITPACK routines.
interp2d : 2-D interpolation
Calling interp1d with NaNs present in input values results in
undefined behaviour.
Input values x and y must be convertible to float values like
int or float.
If the values in x are not unique, the resulting behavior is
undefined and specific to the choice of kind, i.e., changing
kind will change the behavior for duplicates.
Examples:
>>> import matplotlib.pyplot as plt
>>> from scipy import interpolate
>>> x = np.arange(0, 10)
>>> y = np.exp(-x/3.0)
>>> f = interpolate.interp1d(x, y)
>>> xnew = np.arange(0, 9, 0.1)
>>> ynew = f(xnew)   # use interpolation function returned by `interp1d`
>>> plt.plot(x, y, 'o', xnew, ynew, '-')
>>> plt.show()
Adjusted scipy.interpolate.interp1d (documented below)
to interpolate the nominal_values and std_devs of an uncertainty array.
NOTE: uncertainty in the x data is neglected, only uncertainties in the y data are propagated.
Parameters:
std_pow – float. Uncertainty is raised to this power, interpolated, then lowered.
(Note std_pow=2 interpolates the variance, which is often used in fitting routines).
Additional arguments and key word arguments are as in interp1e (documented below).
Examples:
>> x = np.linspace(0,2*np.pi,30)
>> u = unumpy.uarray(np.cos(x),np.random.rand(len(x)))
>>
>> fi = utils.uinterp1d(x,u,std_pow=2)
>> xnew = np.linspace(x[0],x[-1],1e3)
>> unew = fi(xnew)
>>
>> f = figure(num=’uinterp example’)
>> f.clf()
>> ax = f.use_subplot(111)
>> uerrorbar(x,u)
>> uband(xnew,unew)
interp1e Documentation:
Shortcut for scipy.interpolate.interp1d with fill_value=’extrapolate’ and bounds_error=False as defaults
Interpolate a 1-D function.
x and y are arrays of values used to approximate some function f:
y=f(x). This class returns a function whose call method uses
interpolation to find the value of new points.
Parameters:
x : (N,) array_like
A 1-D array of real values.
y : (…,N,…) array_like
A N-D array of real values. The length of y along the interpolation
axis must be equal to the length of x.
kind : str or int, optional
Specifies the kind of interpolation as a string or as an integer
specifying the order of the spline interpolator to use.
The string has to be one of ‘linear’, ‘nearest’, ‘nearest-up’, ‘zero’,
‘slinear’, ‘quadratic’, ‘cubic’, ‘previous’, or ‘next’. ‘zero’,
‘slinear’, ‘quadratic’ and ‘cubic’ refer to a spline interpolation of
zeroth, first, second or third order; ‘previous’ and ‘next’ simply
return the previous or next value of the point; ‘nearest-up’ and
‘nearest’ differ when interpolating half-integers (e.g. 0.5, 1.5)
in that ‘nearest-up’ rounds up and ‘nearest’ rounds down. Default
is ‘linear’.
axis : int, optional
Specifies the axis of y along which to interpolate.
Interpolation defaults to the last axis of y.
copy : bool, optional
If True, the class makes internal copies of x and y.
If False, references to x and y are used. The default is to copy.
bounds_error : bool, optional
If True, a ValueError is raised any time interpolation is attempted on
a value outside of the range of x (where extrapolation is
necessary). If False, out of bounds values are assigned fill_value.
By default, an error is raised unless fill_value="extrapolate".
fill_value : array-like or (array-like, array_like) or “extrapolate”, optional
if a ndarray (or float), this value will be used to fill in for
requested points outside of the data range. If not provided, then
the default is NaN. The array-like must broadcast properly to the
dimensions of the non-interpolation axes.
If a two-element tuple, then the first element is used as a
fill value for x_new<x[0] and the second element is used for
x_new>x[-1]. Anything that is not a 2-element tuple (e.g.,
list or ndarray, regardless of shape) is taken to be a single
array-like argument meant to be used for both bounds as
below, above = fill_value, fill_value.
New in version 0.17.0.
If “extrapolate”, then points outside the data range will be
extrapolated.
New in version 0.17.0.
assume_sorted : bool, optional
If False, values of x can be in any order and they are sorted first.
If True, x has to be an array of monotonically increasing values.
Attributes:
fill_value
Methods:
__call__
See Also:
splrep, splev
Spline interpolation/smoothing based on FITPACK.
UnivariateSpline : An object-oriented wrapper of the FITPACK routines.
interp2d : 2-D interpolation
Calling interp1d with NaNs present in input values results in
undefined behaviour.
Input values x and y must be convertible to float values like
int or float.
If the values in x are not unique, the resulting behavior is
undefined and specific to the choice of kind, i.e., changing
kind will change the behavior for duplicates.
Examples:
>>> import matplotlib.pyplot as plt
>>> from scipy import interpolate
>>> x = np.arange(0, 10)
>>> y = np.exp(-x/3.0)
>>> f = interpolate.interp1d(x, y)
>>> xnew = np.arange(0, 9, 0.1)
>>> ynew = f(xnew)   # use interpolation function returned by `interp1d`
>>> plt.plot(x, y, 'o', xnew, ynew, '-')
>>> plt.show()
Arguments and key word arguments are as in uinterp1d (documented below).
Uinterp1d Documentation:
Adjusted scipy.interpolate.interp1d (documented below)
to interpolate the nominal_values and std_devs of an uncertainty array.
NOTE: uncertainty in the x data is neglected, only uncertainties in the y data are propagated.
Parameters:
std_pow – float. Uncertainty is raised to this power, interpolated, then lowered.
(Note std_pow=2 interpolates the variance, which is often used in fitting routines).
Additional arguments and key word arguments are as in interp1e (documented below).
Examples:
>> x = np.linspace(0,2*np.pi,30)
>> u = unumpy.uarray(np.cos(x),np.random.rand(len(x)))
>>
>> fi = utils.uinterp1d(x,u,std_pow=2)
>> xnew = np.linspace(x[0],x[-1],1e3)
>> unew = fi(xnew)
>>
>> f = figure(num=’uinterp example’)
>> f.clf()
>> ax = f.use_subplot(111)
>> uerrorbar(x,u)
>> uband(xnew,unew)
interp1e Documentation:
Shortcut for scipy.interpolate.interp1d with fill_value=’extrapolate’ and bounds_error=False as defaults
Interpolate a 1-D function.
x and y are arrays of values used to approximate some function f:
y=f(x). This class returns a function whose call method uses
interpolation to find the value of new points.
Parameters:
x : (N,) array_like
A 1-D array of real values.
y : (…,N,…) array_like
A N-D array of real values. The length of y along the interpolation
axis must be equal to the length of x.
kind : str or int, optional
Specifies the kind of interpolation as a string or as an integer
specifying the order of the spline interpolator to use.
The string has to be one of ‘linear’, ‘nearest’, ‘nearest-up’, ‘zero’,
‘slinear’, ‘quadratic’, ‘cubic’, ‘previous’, or ‘next’. ‘zero’,
‘slinear’, ‘quadratic’ and ‘cubic’ refer to a spline interpolation of
zeroth, first, second or third order; ‘previous’ and ‘next’ simply
return the previous or next value of the point; ‘nearest-up’ and
‘nearest’ differ when interpolating half-integers (e.g. 0.5, 1.5)
in that ‘nearest-up’ rounds up and ‘nearest’ rounds down. Default
is ‘linear’.
axis : int, optional
Specifies the axis of y along which to interpolate.
Interpolation defaults to the last axis of y.
copy : bool, optional
If True, the class makes internal copies of x and y.
If False, references to x and y are used. The default is to copy.
bounds_error : bool, optional
If True, a ValueError is raised any time interpolation is attempted on
a value outside of the range of x (where extrapolation is
necessary). If False, out of bounds values are assigned fill_value.
By default, an error is raised unless fill_value="extrapolate".
fill_value : array-like or (array-like, array_like) or “extrapolate”, optional
if a ndarray (or float), this value will be used to fill in for
requested points outside of the data range. If not provided, then
the default is NaN. The array-like must broadcast properly to the
dimensions of the non-interpolation axes.
If a two-element tuple, then the first element is used as a
fill value for x_new<x[0] and the second element is used for
x_new>x[-1]. Anything that is not a 2-element tuple (e.g.,
list or ndarray, regardless of shape) is taken to be a single
array-like argument meant to be used for both bounds as
below, above = fill_value, fill_value.
New in version 0.17.0.
If “extrapolate”, then points outside the data range will be
extrapolated.
New in version 0.17.0.
assume_sorted : bool, optional
If False, values of x can be in any order and they are sorted first.
If True, x has to be an array of monotonically increasing values.
Attributes:
fill_value
Methods:
__call__
See Also:
splrep, splev
Spline interpolation/smoothing based on FITPACK.
UnivariateSpline : An object-oriented wrapper of the FITPACK routines.
interp2d : 2-D interpolation
Calling interp1d with NaNs present in input values results in
undefined behaviour.
Input values x and y must be convertible to float values like
int or float.
If the values in x are not unique, the resulting behavior is
undefined and specific to the choice of kind, i.e., changing
kind will change the behavior for duplicates.
Examples:
>>> import matplotlib.pyplot as plt
>>> from scipy import interpolate
>>> x = np.arange(0, 10)
>>> y = np.exp(-x/3.0)
>>> f = interpolate.interp1d(x, y)
>>> xnew = np.arange(0, 9, 0.1)
>>> ynew = f(xnew)   # use interpolation function returned by `interp1d`
>>> plt.plot(x, y, 'o', xnew, ynew, '-')
>>> plt.show()
Adjusted scipy.interpolate.RegularGridInterpolator (documented below)
to interpolate the nominal_values and std_devs of an uncertainty array.
Parameters:
std_pow – float. Uncertainty is raised to this power, interpolated, then lowered.
(Note std_pow=2 interpolates the variance, which is often used in fitting routines).
Additional arguments and key word arguments are as in RegularGridInterpolator.
Examples:
Make some sample 2D data
>> x = np.linspace(0,2*np.pi,30)
>> y = np.linspace(0,2*np.pi,30)
>> z = np.cos(x[:,np.newaxis]+y[np.newaxis,:])
>> u = unumpy.uarray(np.cos(x[:,np.newaxis]+y[np.newaxis,:]),np.random.rand(*z.shape))
Form interpolator
>> fi = URegularGridInterpolator((x,y),u,std_pow=2)
Note the interpolated uncertainty between points is curved by std_pow=2. The curve is affected by
the uncertainty of nearby off diagonal points (not shown).
RegularGridInterpolator Documentation:
Interpolation on a regular grid in arbitrary dimensions
The data must be defined on a regular grid; the grid spacing however may be
uneven. Linear and nearest-neighbor interpolation are supported. After
setting up the interpolator object, the interpolation method (linear or
nearest) may be chosen at each evaluation.
Parameters:
points : tuple of ndarray of float, with shapes (m1, ), …, (mn, )
The points defining the regular grid in n dimensions.
values : array_like, shape (m1, …, mn, …)
The data on the regular grid in n dimensions.
method : str, optional
The method of interpolation to perform. Supported are “linear” and
“nearest”. This parameter will become the default for the object’s
__call__ method. Default is “linear”.
bounds_error : bool, optional
If True, when interpolated values are requested outside of the
domain of the input data, a ValueError is raised.
If False, then fill_value is used.
fill_value : number, optional
If provided, the value to use for points outside of the
interpolation domain. If None, values outside
the domain are extrapolated.
Methods:
__call__
Notes:
Contrary to LinearNDInterpolator and NearestNDInterpolator, this class
avoids expensive triangulation of the input data by taking advantage of the
regular grid structure.
If any of points have a dimension of size 1, linear interpolation will
return an array of nan values. Nearest-neighbor interpolation will work
as usual in this case.
New in version 0.14.
Examples:
Evaluate a simple example function on the points of a 3-D grid:
This is a convenience function for scipy.integrate.cumtrapz.
Notice that here initial=0 which is what one most often wants, rather than the initial=None,
which is the default for the scipy function.
Cumulatively integrate y(x) using the composite trapezoidal rule.
This is the right way to integrate derivative quantities that were calculated with gradient.
If a derivative was obtained with the diff command, then the cumsum command should be used for its integration.
Parameters:
y – Values to integrate.
x – The coordinate to integrate along. If None (default), use spacing dx between consecutive elements in y.
dx – Spacing between elements of y. Only used if x is None
axis – Specifies the axis to cumulate. Default is -1 (last axis).
initial – If given, uses this value as the first value in the returned result.
Typically this value should be 0. If None, then no value at x[0] is returned and the returned array has one element
less than y along the axis of integration.
Returns:
The result of cumulative integration of y along axis. If initial is None, the shape is such that the axis
of integration has one less value than y. If initial is given, the shape is equal to that of y.
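A minimal sketch of the initial=0 behaviour using scipy directly (the OMFIT wrapper itself is not shown; cumulative_trapezoid is the current scipy name for cumtrapz):

import numpy as np
from scipy.integrate import cumulative_trapezoid

x = np.linspace(0, 2 * np.pi, 100)
dydx = np.cos(x)                                   # e.g. a derivative computed with np.gradient
y = cumulative_trapezoid(dydx, x, initial=0)       # initial=0 -> output has the same length as x
assert y.shape == x.shape
np.testing.assert_allclose(y, np.sin(x), atol=1e-2)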
This function returns the derivative of the 2nd order lagrange interpolating polynomial of y(x)
When re-integrating, to recover the original values y use cumtrapz(dydx,x)
Function to identify outlier data based on median absolute deviation (MAD) distance.
Note: the median absolute deviation used is defined as 1.4826 * np.median(np.abs(np.median(x) - x))
Parameters:
data – input data array (if a dictionary of arrays, the mad_outliers function is applied to each of the values in the dictionary)
m – mad distance multiplier from the median after which a point is considered an outlier
outliers_or_valid – return valid/outlier points (valid is default)
Returns:
boolean array indicating which data points are within m MAD of the median value (i.e. the valid points)
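A minimal sketch of this MAD criterion (a hypothetical helper, not the actual mad_outliers implementation):

import numpy as np

def mad_valid(data, m=3.0):
    """Boolean mask of points within m MAD of the median (sketch of the behaviour above)."""
    data = np.asarray(data, dtype=float)
    med = np.median(data)
    mad = 1.4826 * np.median(np.abs(med - data))   # definition quoted above
    return np.abs(data - med) <= m * mad

x = np.concatenate([np.random.normal(0, 1, 100), [15.0, -20.0]])
print(np.sum(~mad_valid(x)))   # the two injected outliers should be flagged (a few tail points may be too)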
Function to identify outlier data based on binning of the data.
The algorithm bins the data into nbins bins and then considers as valid only the data that falls
within bins that have at least mincount counts.
Parameters:
data – input data array (if a dictionary of arrays, the bin_outliers function is applied to each of the values in the dictionary)
mincount – minimum number of counts within a bin for data to be considered as valid
nbins – number of bins for binning of data
outliers_or_valid – return valid/outlier points (valid is default)
Returns:
boolean array indicating which data points are valid or not
Performs exp(arg) but first limits the value of arg to prevent floating-point math errors. Checks sys.float_info so that it
can avoid floating-point overflow or underflow. Can be informed of factors you plan on multiplying with the exponential
result later in order to make the limits more restrictive (the limits are “padded”) and avoid over/underflow later
on in your code.
Parameters:
arg – Argument of exponential function
factor – Factor that might be multiplied in to exponential function later. Adjusts limits to make them more
restrictive and prevent overflow later on. The adjustment to limits is referred to as padding.
minpad – Force the padding to be at least a certain size.
extra_pad – Extra padding beyond what’s determined by minpad and factor
return_big_small – T/F: flag to just return big and small numbers. You may be able to speed up execution in
repeated calls by getting appropriate limits, doing your own cropping, and looping exp() instead of looping
exp_no_overflow(). Even in this case, exp_no_overflow() can help you pick good limits. Or if you don’t have time
for any of that, you can probably use -70 and 70 as the limits on arg, which will get you to order 1e30.
Returns:
exp(arg) with no math errors, or (big, small) if return_big_small is set
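A minimal sketch of the idea, using sys.float_info to clip the argument before exponentiating (hypothetical helper; the padding logic of exp_no_overflow itself is more elaborate):

import numpy as np
import sys

def exp_clipped(arg, factor=1.0):
    """Sketch of an overflow-safe exponential: clip arg using sys.float_info limits."""
    pad = np.log(abs(factor)) if factor else 0.0
    big = np.log(sys.float_info.max) - pad - 10.0     # leave some headroom ("padding")
    small = np.log(sys.float_info.min * 1e10) - pad   # stay well above underflow as well
    return np.exp(np.clip(arg, small, big))

print(exp_clipped(1e4))   # finite, no overflow warning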
Division function to safely compute the ratio of two lists/arrays.
The fill_value input parameter specifies what value should be filled in
for the result whenever the denominator is 0.
Parameters:
numerator – numerator of the division
denominator – denominator of the division
fill_value – fill value when denominator is 0
Returns:
division with fill_value where nan or inf would have been instead
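A minimal sketch of such a guarded division with numpy (hypothetical helper, not the actual implementation):

import numpy as np

def safe_divide(numerator, denominator, fill_value=0.0):
    """Sketch: elementwise ratio with fill_value wherever the denominator is 0."""
    num = np.asarray(numerator, dtype=float)
    den = np.asarray(denominator, dtype=float)
    out = np.full(np.broadcast(num, den).shape, fill_value, dtype=float)
    np.divide(num, den, out=out, where=(den != 0))
    return out

print(safe_divide([1.0, 2.0, 3.0], [2.0, 0.0, 4.0], fill_value=np.nan))  # [0.5, nan, 0.75]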
Performs nanargmin along an axis on a 2D array while dodging errors from empty rows or columns
argmin finds the index where the minimum value occurs. nanargmin ignores NaNs while doing this.
However, if nanargmin is operating along just one axis and it encounters a row that’s all NaN, it raises a
ValueError, because there’s no valid index for that row. It can’t insert NaN into the result, either,
because the result should be an integer array (or else it couldn’t be used for indexing), and NaN is a float.
This function is for cases where we would like nanargmin to give valid results where possible and clearly invalid
indices for rows that are all NaN. That is, it returns -N, where N is the row length, if the row is all NaN.
Parameters:
x – 2D float array
Input data to process
axis – int
0 or 1
Returns:
1D int array
indices of the minimum value of each row or column.
Rows/columns which are all NaN will have -N, where N is the
size of the relevant dimension (so -N is invalid).
omfit_classes.utils_math.calcz(x, y, consistent_reconstruction=True)[source]¶
Calculate Z: the inverse normalized scale-length
The function is coded in such a way to avoid NaN and Inf where y==0
z = -dy/dx/y
Parameters:
x – x axis array
y – y axis array
consistent_reconstruction – calculate z so that
integration of z with integz exactly generates original profile
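A minimal sketch of the z = -dy/dx/y calculation with a simple guard where y == 0 (the consistent_reconstruction option is not reproduced here):

import numpy as np

def inverse_scale_length(x, y, eps=1e-300):
    """Sketch of z = -dy/dx / y, guarding against division by zero where y == 0."""
    dydx = np.gradient(y, x)
    return -dydx / np.where(y == 0, eps, y)

x = np.linspace(0, 1, 50)
y = np.exp(-3 * x)          # for an exponential profile, z should be ~3 everywhere
print(inverse_scale_length(x, y)[5:10])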
>> from detect_peaks import detect_peaks
>> x = np.random.randn(100)
>> x[60:81] = np.nan
>> # detect all peaks and plot data
>> ind = detect_peaks(x, show=True)
>> print(ind)
>> x = np.sin(2*np.pi*5*np.linspace(0, 1, 200)) + np.random.randn(200)/5.
>> # set minimum peak height = 0 and minimum peak distance = 20
>> detect_peaks(x, mph=0, mpd=20, show=True)
>> foundwidth = x[indx[2]]-x[indx[0]]
>> print(“Width within {:.2f}%”.format((1-2*halfwidth/foundwidth)*100))
Note we chose M to set the appropriate scale of interest for the above problem.
We can see the sensitivity to this by scanning the tanh width in a 2D example.
M sets the scale for the steepness of interest
using one M for a range of bump sizes isolates only the scale of interest
>> indxs = np.apply_along_axis(find_feature,1,dydxs,x=x,M=0.01,k=5)
Tracking the scale of the bump with M approximates tanh widths
>> #indxs = [find_feature(dy,x=x,M=2*hw/10.,k=5) for dy,hw in zip(dydxs,halfwidths)]
>> # found peak and edge points of steep gradient region
>> foundwidths = map(lambda indx: x[indx[2]]-x[indx[0]], indxs)
>> xpts = [x[i] for i in indxs]
>> ypts = [yy[i] for yy,i in zip(ys,indxs)]
>> dypts= [yy[i] for yy,i in zip(dydxs,indxs)]
>> # tanh half width points
>> # Note np.tanh(1) = 0.76, and d/dxtanh = sech^2 = 0.42
>> ihws = np.array([(np.abs(x-center-hw).argmin(),
>> np.abs(x-center+hw).argmin()) for hw in halfwidths[:,0]])
>> xhws = [x[i] for i in ihws]
>> yhws = [yy[i] for yy,i in zip(ys,ihws)]
>> dyhws= [yy[i] for yy,i in zip(dydxs,ihws)]
Visualize the comparison between tanh widths and the identified region of steep gradient
>> close(‘all’)
>> f,ax = pyplot.subplots(3,1)
>> ax[0].set_ylabel(‘y = np.tanh((c-x)2/w)+1’)
>> ax[0].set_xlabel(‘x’)
>> ax[1].set_ylabel(‘dy/dx’)
>> ax[1].set_xlabel(‘x’)
>> for i in [0,24,49]:
… l, = ax[0].plot(x,ys[i])
… w1, = ax[0].plot(xhws[i],yhws[i],marker=’o’,ls=’’,fillstyle=’none’,color=l.get_color())
… w2, = ax[0].plot(xpts[i],ypts[i],marker=’x’,ls=’’,fillstyle=’none’,color=l.get_color())
… l, = ax[1].plot(x,dydxs[i])
… w1, = ax[1].plot(xhws[i],dyhws[i],marker=’o’,ls=’’,fillstyle=’none’,color=l.get_color())
… w2, = ax[1].plot(xpts[i],dypts[i],marker=’x’,ls=’’,fillstyle=’none’,color=l.get_color())
>> ax[-1].set_ylabel(‘Edge Width’)
>> ax[-1].set_xlabel(‘Tanh Width’)
>> ax[-1].plot([0,2*halfwidths[-1,0]],[0,2*halfwidths[-1,0]])
>> ax[-1].plot(2*halfwidths[:,0],foundwidths,marker=’o’,lw=0)
omfit_classes.utils_math.parabolaMax(x, y, bounded=False)[source]¶
Calculate a parabola through x,y, then return the extremum point of
the parabola
Parameters:
x – At least three abscissa points
y – The corresponding ordinate points
bounded – False, ‘max’, or ‘min’
- False: The extremum is returned regardless of location relative to x (default)
- ‘max’ (‘min’): The extremum location must be within the bounds of x, and if not return the location and value of max(y) (min(y))
Returns:
x_max,y_max - The location and value of the extremum
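A minimal sketch of the unbounded case using np.polyfit (hypothetical helper; the bounded behaviour described above is not reproduced):

import numpy as np

def parabola_extremum(x, y):
    """Sketch: fit a parabola y = a*x**2 + b*x + c and return its vertex (x_max, y_max)."""
    a, b, c = np.polyfit(x, y, 2)
    x_ext = -b / (2.0 * a)
    y_ext = np.polyval([a, b, c], x_ext)
    return x_ext, y_ext

x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 2.0, 1.0])       # peak at x = 1
print(parabola_extremum(x, y))       # ~(1.0, 2.0)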
omfit_classes.utils_math.parabolaCycle(X, Y, ix)[source]¶
omfit_classes.utils_math.parabolaMaxCycle(X, Y, ix, bounded=False)[source]¶
Calculate a parabola through X[ix-1:ix+2],Y[ix-1:ix+2], with proper
wrapping of indices, then return the extremum point of the parabola
Parameters:
X – The abscissa points: an iterable to be treated as periodic
Y – The corresponding ordinate points
ix – The index of X about which to find the extremum
bounded – False, ‘max’, or ‘min’
- False: The extremum is returned regardless of location relative to x (default)
- ‘max’ (‘min’): The extremum location must be within the bounds of x, and if not return the location and value of max(y) (min(y))
Returns:
x_max,y_max - The location and value of the extremum
omfit_classes.utils_math.paraboloid(x, y, z)[source]¶
z = ax*x^2 + bx*x + ay*y^2 + by*y + c
NOTE: This function uses only the first 5 points of the x, y, z arrays
to evaluate the paraboloid coefficients
Smooth the data using a window of the requested size.
This method is based on the convolution of a scaled window with the signal.
The signal is prepared by introducing reflected copies of the signal
(with the window size) in both ends so that transient parts are minimized
in the beginning and end part of the output signal.
input:
param x:
the input signal
param window_len:
the dimension of the smoothing window; should be an odd integer; is ignored if window is an array
param window:
the window function to use, see scipy.signal.get_window documentation for list of available windows
‘flat’ or ‘boxcar’ will produce a moving average smoothing
Can also be an array, in which case it is used as the window function and window_len is ignored
An efficient top hat smoother based on the smooth IDL routine.
The use of cumsum-shift(cumsum) means that execution time
is 2xN flops compared to 2 x n_smooth x N for a convolution.
If supplied with a timebase, the shortened timebase is returned as
the first of a tuple.
Parameters:
data – (timebase, data) is a shorthand way to pass timebase
n_smooth – smooth bin size
timebase – if passed, a tuple with (timebase,smoothed_data) gets returned
causal – If True, the smoothed signal never precedes the input;
otherwise, the smoothed signal is “centred” on the input
(for n_smooth odd) and close (1/2 timestep off) for n_smooth even
indices – if True, return the timebase indices instead of the times
keep – Better to throw the partially cooked ends away, but if you want to
keep them use keep=True. This is useful for quick filtering
applications so that original and filtered signals are easily
compared without worrying about timebase
Convolution of a non-uniformly discretized array with window function.
The output values are np.nan where no points are found in finite windows (weight is zero).
The gaussian window is infinite in extent, and thus returns values for all xo.
Supports uncertainties arrays.
If the input –does not– have associated uncertainties, then the output will –not– have associated uncertainties.
Parameters:
yi – array_like (…,N,…). Values of input array
xi – array_like (N,). Original grid points of input array (default y indices)
xo – array_like (M,). Output grid points of convolution array (default xi)
window_size – float.
Width passed to the window function (default: maximum xi step).
For the Gaussian, sigma=window_size/4. and the convolution is integrated across +/-4.*sigma.
window_function – str/function.
Accepted strings are ‘hanning’,’bartlett’,’blackman’,’gaussian’, or ‘boxcar’.
Function should accept x and window_size as arguments and return a corresponding weight.
axis – int. Axis of y along which convolution is performed
causal – int. Forces f(x>0) = 0.
interpolate – False or integer number > 0
Parameter indicating to interpolate the data so that there are `interpolate`
number of data points within a time window. This is useful in the presence of sparse
data, which would result in stair-case output if not interpolated.
The integer value sets the number of points per window size.
std_dev – str/int
Accepted strings are ‘none’, ‘propagate’, ‘population’, ‘expand’, ‘deviation’, ‘variance’.
Only ‘population’ and ‘none’ are valid if yi is not an uncertainties array (i.e. std_devs(yi) is all zeros).
Setting to an integer will convolve the error uncertainties to the std_dev power before taking the std_dev root.
std_dev = ‘propagate’ is true propagation of errors (slow if not interpolating)
std_dev = ‘population’ is the weighted “standard deviation” of the points themselves (strictly correct for the boxcar window)
std_dev = ‘expand’ is propagation of errors weighted by w~1/window_function
std_dev = ‘deviation’ is equivalent to std_dev=1
std_dev = ‘variance’ is equivalent to std_dev=2
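A minimal sketch of the idea behind such a non-uniform windowed convolution, using a Hanning-like weight and a plain weighted average (window conventions and uncertainty handling in the real function may differ):

import numpy as np

def hanning_weight(dx, window_size):
    """Hanning-like weight vs. distance from the output point (zero outside the window)."""
    w = 0.5 * (1.0 + np.cos(np.pi * dx / (window_size / 2.0)))
    return np.where(np.abs(dx) <= window_size / 2.0, w, 0.0)

def nonuniform_smooth(yi, xi, xo, window_size):
    """Sketch of a windowed weighted average on a non-uniform grid (NaN where no points fall in the window)."""
    out = np.full(len(xo), np.nan)
    for k, x0 in enumerate(xo):
        w = hanning_weight(xi - x0, window_size)
        if w.sum() > 0:
            out[k] = np.sum(w * yi) / np.sum(w)
    return out

xi = np.sort(np.random.rand(40))
yi = np.sin(2 * np.pi * xi) + 0.1 * np.random.randn(40)
xo = np.linspace(0, 1, 100)
ys = nonuniform_smooth(yi, xi, xo, window_size=0.1)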
Smooth with triangle kernel, designed to mimic smooth in reviewplus
:param args:
y: float array,
y, window_size: float array, float
y, x, window_size: float array, float array, float
OR
y, window_size[optional] as OMFITmdsValue, float
If x or window_size are not defined, smooth_by_convolution will pick values automatically.
window_size is the half width of the kernel.
The default window_size set by smooth_by_convolution will probably be too small,
unless you are providing a much higher resolution output grid with xo. You should probably set window_size by
providing at least two arguments.
Parameters:
kw – Keywords passed to smooth_by_convolution.
Returns:
float array or uarray or tuple of arrays
ysmo = result from smooth_by_convolution; array (default) or uarray (std_dev is modified)
if OMFITmdsValue:
Convolution of a non-uniformly discretized array with window function.
The output values are np.nan where no points are found in finite windows (weight is zero).
The gaussian window is infinite in extent, and thus returns values for all xo.
Supports uncertainties arrays.
If the input –does not– have associated uncertainties, then the output will –not– have associated uncertainties.
Parameters:
yi – array_like (…,N,…). Values of input array
xi – array_like (N,). Original grid points of input array (default y indices)
xo – array_like (M,). Output grid points of convolution array (default xi)
window_size – float.
Width passed to the window function (default: maximum xi step).
For the Gaussian, sigma=window_size/4. and the convolution is integrated across +/-4.*sigma.
window_function – str/function.
Accepted strings are ‘hanning’,’bartlett’,’blackman’,’gaussian’, or ‘boxcar’.
Function should accept x and window_size as arguments and return a corresponding weight.
axis – int. Axis of y along which convolution is performed
causal – int. Forces f(x>0) = 0.
interpolate – False or integer number > 0
Parameter indicating to interpolate the data so that there are `interpolate`
number of data points within a time window. This is useful in the presence of sparse
data, which would result in stair-case output if not interpolated.
The integer value sets the number of points per window size.
std_dev – str/int
Accepted strings are ‘none’, ‘propagate’, ‘population’, ‘expand’, ‘deviation’, ‘variance’.
Only ‘population’ and ‘none’ are valid if yi is not an uncertainties array (i.e. std_devs(yi) is all zeros).
Setting to an integer will convolve the error uncertainties to the std_dev power before taking the std_dev root.
std_dev = ‘propagate’ is true propagation of errors (slow if not interpolating)
std_dev = ‘population’ is the weighted “standard deviation” of the points themselves (strictly correct for the boxcar window)
std_dev = ‘expand’ is propagation of errors weighted by w~1/window_function
std_dev = ‘deviation’ is equivalent to std_dev=1
std_dev = ‘variance’ is equivalent to std_dev=2
A class for smoothing ND data. Useful for down-sampling and gridding.
Mean or median filter interpolation in N dimensions.
Calculates the mean or median of all values within a size_scale “sphere” of the interpolation point.
Unlike linear interpolation, you can incorporate information from all your data when down-sampling
to a regular grid.
Of course, it would be better to do a weight function (gaussian, hanning, etc) convolution, but the
“volume” elements required for integration get very computationally expensive in multiple dimensions.
In that case, try linear interpolation followed by regular-grid convolution (ndimage processing).
Parameters:
points – np.ndarray of floats shape (M, D), where D is the number of dimensions
values – np.ndarray of floats shape (M,)
size_scales – tuple (D,) scales of interest in each dimension
std_dev – bool. Estimate uncertainty of mean or median interpolation
filter_type – str. Accepts ‘mean’ or ‘median’
Examples:
This example shows the interpolator is reasonably fast, but note it is nowhere
near as scalable to 100k+ points as linear interpolation + ndimage processing.
This example shows that the median filter is “edge preserving” and contrasts both filters
with a more sophisticated convolution. Note the boxcar convolution is not identical to the
mean filter for low sampling because the integral weights isolated points more than very closely
spaced points.
Propagates errors and calculates weighted standard deviation. While nu_conv
does these for a sliding window vs. time, this function is simpler and does
calculations for a single mean of an array.
Parameters:
x – 1D float array
The input data to be averaged
err – 1D float array
Uncertainty in x. Should have the same units as x. Should have the same length as x.
Special case: a single value of err will be used to propagate errors for the standard
deviation of the mean, but will produce uniform (boring) weights.
If no uncertainty is provided (err==None), then uniform weighting is used.
minerr – float
Put a floor on the uncertainties before calculating weight. This prevents a
few points with unreasonably low error from dominating the calculation.
Should be a scalar with same units as x.
return_stddev_mean – bool
Flag for whether the standard deviation of the mean (propagated uncertainty)
should be returned with the mean.
return_stddev_pop – bool
Flag for whether the standard deviation of the population (weighted standard deviation)
should be returned with the mean.
nan_policy – str
‘nan’: return NaN if there are NaNs in x or err
‘error’: raise an exception
‘ignore’: perform the calculation on the non-nan elements only (default)
Returns:
float or tuple
mean if return_stddev_mean = False and return_stddev_pop = False
(mean, xpop, xstdm) if return_stddev_mean = True and return_stddev_pop = True
(mean, xpop) if return_stddev_mean = False and return_stddev_pop = True
(mean, xstdm) if return_stddev_mean = True and return_stddev_pop = False
where xstdm and xpop are the propagated error and the weighted standard deviation.
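A minimal sketch of an inverse-variance weighted mean with the two standard deviations described above (hypothetical helper; minerr, nan_policy and the err==None special cases are not reproduced):

import numpy as np

def weighted_mean(x, err):
    """Sketch: inverse-variance weighted mean with propagated and population standard deviations."""
    x, err = np.asarray(x, float), np.asarray(err, float)
    w = 1.0 / err ** 2
    mean = np.sum(w * x) / np.sum(w)
    xstdm = np.sqrt(1.0 / np.sum(w))                              # std dev of the mean (propagated)
    xpop = np.sqrt(np.sum(w * (x - mean) ** 2) / np.sum(w))       # weighted std dev of the population
    return mean, xpop, xstdm

print(weighted_mean([1.0, 1.2, 0.9], [0.1, 0.2, 0.1]))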
Similar to firFilter, but with a notable difference in edge effects
(butter_smooth may behave better at the edges of an array in some cases).
Parameters:
xx – 1D array
Independent variable. Should be evenly spaced for best results.
yy – array matching dimension of xx
timescale – float
[specifiy either timescale or cutoff]
Smoothing timescale. Units should match xx.
cutoff – float
[specify either timescale or cutoff]
Cutoff frequency. Units should be inverse of xx. (xx in seconds, cutoff in Hz; xx in ms, cutoff in kHz, etc.)
laggy – bool
True: causal filter: smoothed output lags behind input
False: acausal filter: uses information from the future so that smoothed output doesn’t lag input
order – int
Order of butterworth filter.
Lower order filters seem to have a longer tail after the ELM which is helpful for detecting the tail of the ELM.
nan_screen – bool
Perform smoothing on only the non-NaN part of the array, then pad the result out with NaNs to maintain length
btype – string
low or high. For smoothing, always choose low.
You can do a highpass filter instead of lowpass by choosing high, though.
Returns:
array matching dimension of xx
Smoothed version of yy
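A minimal sketch of Butterworth low-pass smoothing with scipy.signal, contrasting the causal (laggy) and acausal variants (hypothetical helper, not the actual butter_smooth):

import numpy as np
from scipy import signal

def butter_lowpass(xx, yy, cutoff, order=1, laggy=False):
    """Sketch of Butterworth low-pass smoothing: causal (lfilter) vs. acausal (filtfilt)."""
    fs = 1.0 / np.mean(np.diff(xx))                  # sampling frequency from (evenly spaced) xx
    b, a = signal.butter(order, cutoff / (0.5 * fs), btype='low')
    return signal.lfilter(b, a, yy) if laggy else signal.filtfilt(b, a, yy)

t = np.linspace(0, 1, 1000)                          # seconds
y = np.sin(2 * np.pi * 5 * t) + 0.3 * np.random.randn(t.size)
ysmo = butter_lowpass(t, y, cutoff=10.0)             # keep the 5 Hz signal, suppress the noise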
omfit_classes.utils_math.windowed_FFT(t, y, nfft='auto', overlap=0.95, hanning_window=True, subtract_mean=True, real_input=None)[source]¶
Bin data into windows and compute FFT for each.
Gives amplitude vs. time and frequency.
Useful for making spectrograms.
input:
param t:
1D time vector in ms
param y:
1D parameter as a function of time
param nfft:
Number of points in each FFT bin. More points = higher frequency resolution but lower time
resolution.
param overlap:
Fraction of window that overlaps previous/next window.
param hanning_window:
Apply a Hanning window to y(t) before each FFT.
param subtract_mean:
Subtract the average y value (of the entire data set, not individually per window) before
calculating FFT
param real_input:
T/F: Use rfft instead of fft because all inputs are real numbers.
Set to None to decide automatically.
output:
return:
spectral density (time,frequency)
return:
array of times at the center of each FFT window
return:
frequency
return:
amplitude(time, positive frequency)
return:
power(time, positive frequency)
return:
positive-only frequency array
return:
nfft (helpful if you let it be set automatically)
more on overlap for windows 0, 1, 2:
overlap = 0.0 : no overlap, window 0 ends where window 1 begins
overlap = 0.5 : half overlap, window 0 ends where window 2 begins
overlap = 0.99: this is probably as high as you should go. It will look nice and smooth, but will take longer.
overlap = 1.0 : FAIL, all the windows would stack on top of each other and infinite windows would be required.
more on nfft:
Set nfft=’auto’ and the function will pick a power of two that should give a reasonable view of the dataset.
It won’t choose nfft < 16.
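For orientation, an analogous amplitude-vs-time-and-frequency map can be produced with scipy.signal.spectrogram; this sketch mirrors the defaults described above (hann window, 95% overlap, mean subtraction) but is not the OMFIT function itself:

import numpy as np
from scipy import signal

t = np.arange(0, 100, 0.01)                              # ms
y = np.sin(2 * np.pi * 0.5 * t) + 0.2 * np.random.randn(t.size)
fs = 1.0 / np.mean(np.diff(t))                           # samples per ms -> frequencies in kHz

nfft = 256
freq, times, Sxx = signal.spectrogram(
    y - y.mean(),                                        # analogue of subtract_mean=True
    fs=fs,
    window='hann',                                       # analogue of hanning_window=True
    nperseg=nfft,
    noverlap=int(0.95 * nfft),                           # analogue of overlap=0.95
)
# Sxx has shape (len(freq), len(times)): spectral density vs. frequency and window time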
omfit_classes.utils_math.noise_estimator(t, y, cutoff_timescale=None, cutoff_omega=None, window_dt=0, dt_var_thresh=1e-09, avoid_jumps=False, debug=False, debug_plot=None, restrict_cutoff=False)[source]¶
Estimates uncertainty in a signal by assuming that high frequency (above cutoff frequency) variation is random noise
Parameters:
t – 1D float array with regular spacing
(ms) Time base for the input signal
y – 1D float array with length matching t
(arbitrary) Input signal value as a function of t
cutoff_timescale – float
(ms) For a basic RC filter, this would be tau = R*C. Define either this or cutoff_omega.
cutoff_omega – float
(krad/s) The cutoff angular frequency, above which variation is assumed to be noise. Overrides
cutoff_timescale if specified. cutoff_timescale = 1.0/cutoff_omega
window_dt – float or None
(ms) Window half width for windowed_fft, used in some strategies for time varying noise.
If <= 0, one scalar noise estimate is made for the entire range of t, using FFT methods.
Set to None to choose window size automatically based on cutoff_timescale.
This option does not seem to be as good as the standard method.
debug – bool
Flag for activating debugging tests, plots (unless debug_plot is explicitly disabled), and reports.
Also returns all four estimates instead of just the best one.
debug_plot – bool [optional]
By default, debug_plot copies the value of debug, but you can set it to False to disable plots and still get
other debugging features. Setting debug_plot without debug is not supported and will be ignored.
dt_var_thresh – float
(fraction) Threshold for variability in dt. t must be evenly spaced, so nominally dt will be a constant.
Because of various real world issues, there could be some slight variation. As long as this is small, everything
should be fine. You can adjust the threshold manually, but be careful: large variability in the spacing of the
time base breaks assumptions in the signal processing techniques used by this method.
If std(dt)/mean(dt) exceeds this threshold, an exception will be raised.
avoid_jumps – bool
Large jumps in signal will have high frequency components which shouldn’t be counted with high frequency noise.
You SHOULD pick a time range to avoid these jumps while estimating noise, but if you don’t want to, you can try
using this option instead.
If this flag is true, an attempt will be made to identify time periods when jumps are happening. The time
derivative, dy/dt, is evaluated at times where there are no detected jumps and interpolated back onto the full
time base before being integrated to give a new signal. The new signal should have a continuous derivative with
no spikes, such that its high frequency component should now be just noise. This will prevent the high frequency
components of a jump in y from bleeding into the noise estimate near the jump. The noise estimate during the
jump may not be accurate, but you were never going to get a good fix on that, anyway.
The big problem that is solved by this option is that the jump causes spurious spectral noise which extends well
before and after the jump itself. Cutting the jump out somehow confines the problem to the relatively narrow
time range when the jump itself is happening.
restrict_cutoff – bool
Some versions of scipy throw an error if cutoff_frequency > nyquist frequency, and others do not. If your
version hates high frequency cutoffs, set this to True and cutoff will be reduced to nyquist - df/2.0, where
df is the frequency increment of the FFT, if cutoff >= nyquist.
Returns:
1D uncertain float array with length matching t, or set of four such arrays with different estimates.
Lowpass smoothed y with uncertainty (dimensions and units match input y)
The standard estimate is a Hilbert envelope of the high frequency part of the signal, times a constant for
correct normalization (norm_factor = np.sqrt(0.5) = std_dev(sin(x))),
where smoothing of the envelope is accomplished by a Butterworth lowpass filter using cutoff_frequency.
There are other estimates (accessible by setting the debug flag) based on the fluctuation amplitude in the
windowed FFT above the cutoff frequency.
Gives a string summarizing array or list contents: either elements [0, 1, 2, …, -3, -2, -1] or all elements
:param a: Array, list, or other compatible iterable to summarize
:return: String with summary of array, or just str(a) for short arrays
Returns a dictionary with name, symbol, symbol_A, Z, mass, A, and abundance information of all the
elements that match a given query.
Most of the information was gathered from: http://www.sisweb.com/referenc/source/exactmas.htm
Parameters:
symbol_A – string
Atomic symbol followed by the mass number in parenthesis eg. H(2) for Deuterium
symbol –
string
Atomic symbol
can be followed by the mass number eg. H2 for Deuterium
can be preceded by the mass number and followed by the ion charge number eg. 2H1 for Deuterium
name – string
Long name of the atomic element
Z – int
Atomic number (proton count in nucleus)
Z_ion – int
Charge number of the ion (if not specified, it is assumed Z_ion = Z)
mass – float
Mass of the atomic element in AMU
For matching, it will be easier to use A
A – int
Mass of the atomic element rounded to the closest integer
abundance – float
Abundance of the atomic element as a fraction 0 < abundance <= 1
use_D_T – bool
Whether to use deuterium and tritium for isotopes of hydrogen
return_most_abundant – bool
Whether only the most abundant element should be returned for a query that matches multiple isotopes
Returns:
dictionary with all the elements that match a query
Return a LaTeX formatted string with the element’s symbol, charge state as superscript, and optionally mass number
as subscript.
:param z_n or zn: int
Nuclear charge / atomic number
Parameters:
z_a or za – int [optional]
Ionic charge, including any electrons which may be bound.
Defaults to displaying fully-stripped ion if not specified (z_a = z_n).
zamin – int
Minimum for a range of Zs in a bundled charge state (like in SOLPS-ITER)
zamax – int
Maximum for a range of Zs in a bundled charge state (like in SOLPS-ITER)
a or am – int [optional]
Mass number. Provides additional filter on results.
Given an atomic element symbol, it returns its charge Z
Parameters:
symbol – element symbol
Returns:
element charge Z
omfit_classes.utils_math.splinet(t, y, x, tau)[source]¶
Tension spline evaluator
By VICTOR AGUIAR
NUMERICAL ANALYSIS OF KINCAID
Parameters:
t – node locations
y – values at the nodes
x – locations at which to evaluate the spline
tau – tension
class omfit_classes.utils_math.CLSQTensionSpline(x, y, t, tau=1, w=None, xbounds=(None,None), min_separation=0, xy_constraints=(), xyprime_constraints=(), optimize_knot_type='xy')[source]¶
Bases: object
Constrained least square tension spline.
Parameters:
x – np.ndarray. Data grid.
y – np.ndarray. Data values.
t – int or np.ndarray. Knot number or locations.
tau – float. Tension (higher is smoother).
w – np.ndarray. Data weights (usually ~1/std_devs)
xbounds – tuple. Minimum and maximum x of knot locations.
min_separation – float. Minimum separation between knot locations.
xy_constraints – list of tuples. Spline is constrained to have these values at these locations.
xyprime_constraints – Spline is constrained to have these derivatives at these locations.
optimize_knot_type – str. choose ‘xy’ to simultaneously optimize knot (x,y) values, ‘x’ to optimize
x and y separately, and ‘y’ to simply use the prescribed knot locations.
Examples:
Using the same data from the LSQUnivariateSpline examples,
>> x = np.linspace(-3, 3, 50)
>> y = np.exp(-x**2) + 0.1 * np.random.randn(50)
We can fit a tension spline. We can even set some boundary constraints.
>> t = [-1, 0, 1]
>> spl = CLSQTensionSpline(x, y, t, tau=0.1, xy_constraints=[(-3,0)])
>> xs = np.linspace(-3, 3, 1000)
>> pyplot.subplots()
>> pyplot.plot(x, y, marker=’o’, lw=0)
>> pyplot.plot(xs, spl(xs))
Note the xknots are optimized by default. We can compare to the un-optimized knot locations,
but (for historical reasons) we won’t be able to set constraints.
The figure shows the uncertainties spline more accurately captures the meaningful deviations beyond the errorbars.
In numbers,
>> default_smooth = spl._data[6]
>> default_rchisq = spl.get_residual() / (nx - (len(spl.get_coeffs()) + len(spl.get_knots()) - 2))
>> print(‘Default smoothing is {:.1f} results in reduced chi squared {:.1f}’.format(default_smooth, default_rchisq))
>> print(‘Optimal smoothing of {:.1f} results in reduced chi squared {:.1f}’.format(uspl.get_smoothing_factor(),
… uspl.get_reduced_chisqr()))
If the difference is not large, try running again (the deviations are random!).
To see how the optimizer arrived at the result, you can get the full evolution. Remember, it is targeting
a reduced chi squared of unity.
>> s, f = uspl.get_evolution(norm=False)
>> fig, ax = pyplot.subplots()
>> ax.plot(s, f, marker=’o’, ls=’’) # all points tested
>> ax.plot([uspl.get_smoothing_factor()], [uspl.get_reduced_chisqr() - 1], marker=’s’) # final value
>> ax.set_xscale(‘log’)
>> ax.set_xlabel(‘s’)
>> ax.set_ylabel(‘Reduced chi squared - 1’)
Parameters:
x – np.ndarray. Must be increasing
y – unumpy.uarray. Uncertainties array from uncertainties.unumpy.
w – np.ndarray. Optionally overrides uncertainties from y. Assumed to be 1/std_devs of gaussian errors.
bbox – (2,) array_like. 2-sequence specifying the boundary of the approximation interval.
k – int. Degree of the smoothing spline. Must be 1 <= k <= 5. Default is k=3, a cubic spline.
ext – int or str. Controls the extrapolation mode for elements not in the knot interval. Default 0.
if ext=0 or ‘extrapolate’, return the extrapolated value.
if ext=1 or ‘zeros’, return 0
if ext=2 or ‘raise’, raise a ValueError
if ext=3 or ‘const’, return the boundary value.
check_finite – bool. Whether to check that the input arrays contain only finite numbers.
max_interior_knots – int. Maximum number of interior knots in a successful optimization.
Use this to enforce over smoothing.
Constrained least square univariate spline. This class sacrifices the generality
of UnivariateSpline’s smoothing but enables the ability to constrain values and/or
derivatives of the spline.
The constraints are used in an optimization of the knot locations, not fundamentally
enforced in the underlying equations. Thus, constraints far from the natural spline
will cause errors.
Examples:
Using the same data from the LSQUnivariateSpline examples,
But the new part of this class is that is enables additional constraints on the spline.
For example, we can request the spline have zero derivative at the left boundary.
Initialize an instance of a constrained least square univariate spline.
Parameters:
x – (N,) array_like. Input dimension of data points. Must be increasing
y – (N,) array_like. Input dimension of data points
w – (N,) array_like. Weights for spline fitting. Must be positive. Default is equal weighting.
bbox – (2,) array_like. 2-sequence specifying the boundary of the approximation interval.
k – int. Degree of the smoothing spline. Must be 1 <= k <= 5. Default is k=3, a cubic spline.
ext – int or str. Controls the extrapolation mode for elements not in the knot interval. Default 0.
if ext=0 or ‘extrapolate’, return the extrapolated value.
if ext=1 or ‘zeros’, return 0
if ext=2 or ‘raise’, raise a ValueError
if ext=3 or ‘const’, return the boundary value.
check_finite – bool. Whether to check that the input arrays contain only finite numbers.
t – (M,) array_like or integer. Interior knots of the spline in ascending order (t in
LSQUnivariateSpline) or maximum number of interior knots (max_interior_knots in AutoUnivariateSpline).
optimize_knots – bool. Allow optimizer to change knot locations after initial guess from t or AutoUnivariateSpline.
min_separation – float. Minimum separation between knot locations if not explicitly specified by t.
xy_constraints – list of tuples. Spline is constrained to have these values at these locations.
xyprime_constraints – Spline is constrained to have these derivatives at these locations.
optimize_knot_type – str. choose ‘xy’ to simultaneously optimize knot (x,y) values, ‘x’ to optimize
x and y separately, and ‘y’ to simply use the prescribed knot locations.
maxiter – int. Maximum number of iterations for spline coeff optimization under constraints.
Monte Carlo Uncertainty propagation through python spline fits.
The concept follows https://gist.github.com/thriveth/4680e3d3cd2cfe561a57 by Thøger Rivera-Thorsen (thriveth),
and essentially forms n_trials unique spline instances with randomly perturbed data assuming w=1/std_devs
of gaussian noise.
Note, calling instances of this class returns unumpy.uarrays of Variable objects using the uncertainties package.
Examples:
Using the same data from the LSQUnivariateSpline examples,
Note, this class is a child of the scipy.interpolate.LSQUnivariateSpline class, and
has all of the standard spline class methods. Where appropriate, these methods dig
into the montecarlo trials to return uncertainties. For example,
>> print(‘knots are fixed at {}’.format(splc.get_knots()))
>> print(‘coeffs vary around {}’.format(splc.get_coeffs()))
Initialize an instance of a MonteCarlo constrained least square univariate spline.
Parameters:
x – (N,) array_like. Input dimension of data points. Must be increasing
y – (N,) array_like. Input dimension of data points
w – (N,) array_like. Weights for spline fitting. Must be positive. Default is equal weighting.
bbox – (2,) array_like. 2-sequence specifying the boundary of the approximation interval.
k – int. Degree of the smoothing spline. Must be 1 <= k <= 5. Default is k=3, a cubic spline.
ext – int or str. Controls the extrapolation mode for elements not in the knot interval. Default 0.
if ext=0 or ‘extrapolate’, return the extrapolated value.
if ext=1 or ‘zeros’, return 0
if ext=2 or ‘raise’, raise a ValueError
if ext=3 or ‘const’, return the boundary value.
check_finite – bool. Whether to check that the input arrays contain only finite numbers.
t – (M,) array_like or integer. Interior knots of the spline in ascending order or maximum number
of interior knots.
optimize_knots – bool. Allow optimizer to change knot locations after initial guess from t or AutoUnivariateSpline.
min_separation – float. Minimum separation between knot locations if not explicitly specified by t.
xy_constraints – list of tuples. Spline is constrained to have these values at these locations.
xyprime_constraints – Spline is constrained to have these derivatives at these locations.
optimize_knot_type – str. choose ‘xy’ to simultaneously optimize knot (x,y) values, ‘x’ to optimize
x and y separately, and ‘y’ to simply use the prescribed knot locations.
maxiter – int. Maximum number of iterations for spline coeff optimization under constraints.
n_trials – int. Number of Monte Carlo spline iterations used to form errorbars.
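A minimal sketch of the Monte Carlo idea using scipy's LSQUnivariateSpline directly (the constraint handling and uncertainties objects of the real class are not reproduced):

import numpy as np
from scipy.interpolate import LSQUnivariateSpline

def monte_carlo_spline(x, y, ystd, t, n_trials=100):
    """Sketch of the Monte Carlo idea: refit a spline to randomly perturbed data and
    collect the spread of the evaluations (the real class wraps this with uncertainties objects)."""
    xs = np.linspace(x[0], x[-1], 200)
    trials = np.empty((n_trials, xs.size))
    for i in range(n_trials):
        ypert = y + np.random.normal(0.0, ystd)          # perturb data assuming gaussian errors
        trials[i] = LSQUnivariateSpline(x, ypert, t)(xs)
    return xs, trials.mean(axis=0), trials.std(axis=0)

x = np.linspace(-3, 3, 50)
y = np.exp(-x ** 2) + 0.1 * np.random.randn(50)
xs, ymean, yspread = monte_carlo_spline(x, y, 0.1 * np.ones_like(y), t=[-1, 0, 1])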
Calculate minimum distance from a set of points to a set of line segments.
The segments might be defined by consecutive vertices in a polyline.
The closest point is closest to the SEGMENT, not the line extended to infinity.
The inputs can be arrays or scalars (that get forced into 1 element arrays).
All arrays longer than 1 must have matching length.
If (px, py) and (x1, x2, y1, y2) have the same length, the comparison is done for
(px[0], py[0]) vs (x1[0], y1[0], x2[0], y2[0]). That is, line 0 (x1[0], …) is only
compared to point 0.
All inputs should have matching units.
Parameters:
px – 1D float array-like
X coordinates of test points
py – 1D float array-like
Y coordinates of test points
x1 – 1D float array-like
X-coordinates of the first endpoint of each line segment.
x2 – 1D float array-like
X-coordinates of the second endpoint of each line segment.
y1 – 1D float array-like
Y-coordinates of the first endpoint of each line segment.
y2 – 1D float array-like
Y-coordinates of the second endpoint of each line segment.
return_closest_point – bool
Return the coordinates of the closest points instead of the distances.
Returns:
array or tuple of arrays
if return_closest_point = True:
tuple of two 1D float arrays with the X and Y coordinates of the closest
point on each line segment to each point.
if return_closest_point = False:
1D float array giving the shortest distances between the points and the line segments.
omfit_classes.utils_math.point_in_poly(x, y, poly)[source]¶
Determine if a point is inside a given polygon or not.
Polygon is a list of (x,y) pairs. This function returns True or False.
The algorithm is called the “Ray Casting Method”.
Source: http://geospatialpython.com/2011/01/point-in-polygon.html , retrieved 20160105 18:39
:param x, y: floats
Coordinates of the point to test
Parameters:
poly – List of (x,y) pairs defining a polygon.
Returns:
bool
Flag indicating whether or not the point is within the polygon.
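A minimal sketch of the Ray Casting Method described above (standalone function for illustration; the OMFIT implementation may differ in details):

def point_in_polygon(x, y, poly):
    """Ray Casting Method sketch: count crossings of a horizontal ray with the polygon edges."""
    inside = False
    n = len(poly)
    x1, y1 = poly[0]
    for i in range(1, n + 1):
        x2, y2 = poly[i % n]
        if min(y1, y2) < y <= max(y1, y2) and x <= max(x1, x2):
            if y1 != y2:
                xinters = (y - y1) * (x2 - x1) / (y2 - y1) + x1
                if x1 == x2 or x <= xinters:
                    inside = not inside
        x1, y1 = x2, y2
    return inside

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(point_in_polygon(0.5, 0.5, square), point_in_polygon(2.0, 0.5, square))  # True False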
Multidimensional Gaussian filter.
Parameters:
input – array_like. The input array.
sigma – scalar or sequence of scalars
Standard deviation for Gaussian kernel. The standard
deviations of the Gaussian filter are given for each axis as a
sequence, or as a single number, in which case it is equal for
all axes.
order – int or sequence of ints, optional
The order of the filter along each axis is given as a sequence
of integers, or as a single number. An order of 0 corresponds
to convolution with a Gaussian kernel. A positive order
corresponds to convolution with that derivative of a Gaussian.
The multidimensional filter is implemented as a sequence of
one-dimensional convolution filters. The intermediate arrays are
stored in the same data type as the output. Therefore, for output
types with a limited precision, the results may be imprecise
because intermediate results may be stored with insufficient
precision.
Examples:
>> from scipy.ndimage import gaussian_filter
>> a = np.arange(50, step=2).reshape((5,5))
>> a
np.array([[ 0, 2, 4, 6, 8],
>> from scipy import misc
>> from matplotlib import pyplot
>> fig = pyplot.figure()
>> pyplot.gray() # show the filtered result in grayscale
>> ax1 = fig.add_subplot(121) # left side
>> ax2 = fig.add_subplot(122) # right side
>> ascent = misc.ascent()
>> result = gaussian_filter(ascent, sigma=5)
>> ax1.imshow(ascent)
>> ax2.imshow(result)
>> pyplot.show()
Here is a nice little demo of the added OMFIT causal feature,
s2 – ndarray. Time series of measurement values (default of None uses s1)
fs – Sampling frequency of the x time series. Defaults to 1.0.
nperseg – int. Length of each segment. Defaults to None, but if window is str or tuple, is set to 256, and if window is array_like, is set to the length of the window.
noverlap – int. Number of points to overlap between segments. If None, noverlap = nperseg // 8. Defaults to None.
kwargs – All additional key word arguments are passed to signal.spectrogram
Return f, bicoherence:
array of frequencies and matrix of bicoherence at those frequencies
Open file and return a stream. Raise OSError upon failure.
file is either a text or byte string giving the name (and the path
if the file isn’t in the current working directory) of the file to
be opened or an integer file descriptor of the file to be
wrapped. (If a file descriptor is given, it is closed when the
returned I/O object is closed, unless closefd is set to False.)
mode is an optional string that specifies the mode in which the file
is opened. It defaults to ‘r’ which means open for reading in text
mode. Other common values are ‘w’ for writing (truncating the file if
it already exists), ‘x’ for creating and writing to a new file, and
‘a’ for appending (which on some Unix systems, means that all writes
append to the end of the file regardless of the current seek position).
In text mode, if encoding is not specified the encoding used is platform
dependent: locale.getpreferredencoding(False) is called to get the
current locale encoding. (For reading and writing raw bytes use binary
mode and leave encoding unspecified.) The available modes are:
Character   Meaning
‘r’         open for reading (default)
‘w’         open for writing, truncating the file first
‘x’         create a new file and open it for writing
‘a’         open for writing, appending to the end of the file if it exists
‘b’         binary mode
‘t’         text mode (default)
‘+’         open a disk file for updating (reading and writing)
‘U’         universal newline mode (deprecated)
The default mode is ‘rt’ (open for reading text). For binary random
access, the mode ‘w+b’ opens and truncates the file to 0 bytes, while
‘r+b’ opens the file without truncation. The ‘x’ mode implies ‘w’ and
raises a FileExistsError if the file already exists.
Python distinguishes between files opened in binary and text modes,
even when the underlying operating system doesn’t. Files opened in
binary mode (appending ‘b’ to the mode argument) return contents as
bytes objects without any decoding. In text mode (the default, or when
‘t’ is appended to the mode argument), the contents of the file are
returned as strings, the bytes having been first decoded using a
platform-dependent encoding or using the specified encoding if given.
‘U’ mode is deprecated and will raise an exception in future versions
of Python. It has no effect in Python 3. Use newline to control
universal newlines mode.
buffering is an optional integer used to set the buffering policy.
Pass 0 to switch buffering off (only allowed in binary mode), 1 to select
line buffering (only usable in text mode), and an integer > 1 to indicate
the size of a fixed-size chunk buffer. When no buffering argument is
given, the default buffering policy works as follows:
Binary files are buffered in fixed-size chunks; the size of the buffer
is chosen using a heuristic trying to determine the underlying device’s
“block size” and falling back on io.DEFAULT_BUFFER_SIZE.
On many systems, the buffer will typically be 4096 or 8192 bytes long.
“Interactive” text files (files for which isatty() returns True)
use line buffering. Other text files use the policy described above
for binary files.
encoding is the name of the encoding used to decode or encode the
file. This should only be used in text mode. The default encoding is
platform dependent, but any encoding supported by Python can be
passed. See the codecs module for the list of supported encodings.
errors is an optional string that specifies how encoding errors are to
be handled—this argument should not be used in binary mode. Pass
‘strict’ to raise a ValueError exception if there is an encoding error
(the default of None has the same effect), or pass ‘ignore’ to ignore
errors. (Note that ignoring encoding errors can lead to data loss.)
See the documentation for codecs.register or run ‘help(codecs.Codec)’
for a list of the permitted encoding error strings.
newline controls how universal newlines works (it only applies to text
mode). It can be None, ‘’, ‘\n’, ‘\r’, and ‘\r\n’. It works as
follows:
On input, if newline is None, universal newlines mode is
enabled. Lines in the input can end in ‘\n’, ‘\r’, or ‘\r\n’, and
these are translated into ‘\n’ before being returned to the
caller. If it is ‘’, universal newline mode is enabled, but line
endings are returned to the caller untranslated. If it has any of
the other legal values, input lines are only terminated by the given
string, and the line ending is returned to the caller untranslated.
On output, if newline is None, any ‘\n’ characters written are
translated to the system default line separator, os.linesep. If
newline is ‘’ or ‘\n’, no translation takes place. If newline is any
of the other legal values, any ‘\n’ characters written are translated
to the given string.
If closefd is False, the underlying file descriptor will be kept open
when the file is closed. This does not work when a file name is given
and must be True in that case.
A custom opener can be used by passing a callable as opener. The
underlying file descriptor for the file object is then obtained by
calling opener with (file, flags). opener must return an open
file descriptor (passing os.open as opener results in functionality
similar to passing None).
open() returns a file object whose type depends on the mode, and
through which the standard file operations such as reading and writing
are performed. When open() is used to open a file in a text mode (‘w’,
‘r’, ‘wt’, ‘rt’, etc.), it returns a TextIOWrapper. When used to open
a file in a binary mode, the returned class varies: in read binary
mode, it returns a BufferedReader; in write binary and append binary
modes, it returns a BufferedWriter, and in read/write mode, it returns
a BufferedRandom.
It is also possible to use a string or bytearray as a file for both
reading and writing. For strings StringIO can be used like a file
opened in a text mode, and for bytes a BytesIO can be used like a file
opened in a binary mode.
Checks if an object is of a certain type by looking at the class name (not the class object).
This is useful to circumvent the need to import the Python modules that define those classes.
Parameters:
inv – object of which to check the class
cls – string or list of string with the name of the class(es) to be checked
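A minimal sketch of checking a type by class name (the helper name is illustrative); walking the MRO lets subclasses match too, without importing the module that defines the class:

def is_instance_by_name(obj, cls):
    # Compare class names along the MRO instead of using isinstance with class objects
    names = [cls] if isinstance(cls, str) else list(cls)
    return any(klass.__name__ in names for klass in type(obj).__mro__)

from collections import OrderedDict
d = OrderedDict()
print(is_instance_by_name(d, 'OrderedDict'))     # True
print(is_instance_by_name(d, ['dict', 'list']))  # True, since OrderedDict inherits from dict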
Return the object that dynamic expressions return when evaluated
This allows OMFITexpression(‘None’) is None to work as one would expect.
Expressions that are invalid will raise an OMFITexception when evaluated
Parameters:
inv – input object
Returns:
If inv was a dynamic expression, returns the object that dynamic expressions return when evaluated
This is a convenience function to evaluate whether an object or an expression is None.
Use of this function is preferred over testing whether an expression is None
with the == operator, since numpy arrays evaluate == on a per-element basis.
Function to print with DEBUG style.
Printing is done based on environmental variable OMFIT_DEBUG
which can either be a string with an integer (indicating a debug level)
or a string with a debug topic as defined in OMFITaux[‘debug_logs’]
Parameters:
*objects – what to print
level – minimum value of debug for which printing will occur
Load and initialize a module implemented as a dynamically loadable shared library and return its module object.
If the module was already initialized, it will be initialized again.
Re-initialization involves copying the __dict__ attribute of the cached instance of the module over the value used in the module cached in sys.modules.
Note: using shared libraries is highly system dependent, and not all systems support it.
Parameters:
name – name used to construct the name of the initialization function: an external C function called initname() in the shared library is called
This function sets up the remote ssh tunnel (if necessary) to connect to the server
and returns the username,server,port triplet onto which to connect.
Parameters:
server – string with remote server
tunnel – string with via tunnel (multi-hop tunneling with comma separated list of hops)
forceTunnel – force tunneling even if server is directly reachable (this is useful when servers check provenance of an IP)
forceRemote – force remote connection even if server is localhost
allowEmptyServerUsername – allow empty server username
Specifies a local “dynamic” application-level port forwarding.
Whenever a connection is made to a defined port on the local side, the connection is forwarded over the secure channel, and the application protocol is then used to determine where to connect to from the remote machine.
The SOCKS4 and SOCKS5 protocols are supported, and ssh will act as a SOCKS server.
Converts a function to an OMFITpythonTask instance that can be saved in the tree
Parameters:
funct – function
The function you want to export
self_ref – object
Reference to the object that would be called self within the script.
Its location in the tree will be looked up and used to replace ‘self’ in the code.
This is used to add a line defining the variable self within the new OMFITpythonTask’s source. If the function
doesn’t use self, then it just has to be something that won’t throw an exception, since it won’t be used
(e.g. self_ref=OMFIT should work if you’re not using self)
Finds which list element is most common (most useful for a list of strings or mixed strings & numbers)
:param input_list: list with hashable elements (no nested lists)
:return: The list element with the most occurrences. In a tie, one of the winning elements is picked arbitrarily.
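A minimal sketch of the same idea using the standard library (illustrative; the actual implementation may differ):

from collections import Counter

def most_common_element(input_list):
    # most_common(1) returns [(element, count)]; ties are broken arbitrarily
    return Counter(input_list).most_common(1)[0][0]

print(most_common_element(['a', 'b', 'a', 3, 'a', 3]))  # 'a'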
Select keys from a dictionary of dictionaries. This is useful to select data from a dictionary that uses a hash
as the key for its children dictionaries, and the hash is based on the content of the children.
Flips values and keys of a dictionary
People sometimes search the help for swap_keys, switch_keys, or flip_keys to find this function.
Parameters:
dictionary – dict
input dictionary to be processed
modify_original – bool
whether the original dictionary should be modified
add_key_to_value_first – bool
Append the original key to the value (which will become the new key).
The new dictionary will look like: {‘value (key)’: key},
where key and value were the original key and value.
This will force the new key to be a string.
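A minimal sketch consistent with the parameters described above (illustrative; not necessarily the exact OMFIT implementation):

def flip_values_and_keys(dictionary, modify_original=False, add_key_to_value_first=False):
    if add_key_to_value_first:
        # New keys look like 'value (key)', which forces them to be strings
        flipped = {'%s (%s)' % (v, k): k for k, v in dictionary.items()}
    else:
        flipped = {v: k for k, v in dictionary.items()}
    if modify_original:
        dictionary.clear()
        dictionary.update(flipped)
        return dictionary
    return flipped

print(flip_values_and_keys({'a': 1, 'b': 2}))                       # {1: 'a', 2: 'b'}
print(flip_values_and_keys({'a': 1}, add_key_to_value_first=True))  # {'1 (a)': 'a'}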
Get a version similar to the results of self.describe, but restricted to tags containing a specific string. This
is for finding the tagged version of a module: one might use repo.get_tag_version('cake_') to get the
version of the CAKE module (like ‘cake_00.01.684e4d226a’ or ‘cake_00.01’)
Parameters:
tag_family – A substring defining a family of tags. It is assumed that splitting this substring out of the
git tags will leave behind version numbers for that family.
Returns:
A string with the most recent tag in the family, followed by a commit short hash if there have been
commits since the tag was defined.
Clone of the repository in the OMFITworking environment OMFITtmpDir+os.sep+’repos’
and maintain remotes information. Note: original_git_repository is the remote that points
to the original repository that was cloned
Hash a string using SHA1 and truncate hash at given length
Use this function instead of Python hash(string) since with
Python 3 the seed used for hashing changes between Python sessions
Parameters:
string – input string to be hashed
length – length of the hash (max 40)
Returns:
SHA1 hash of the string in hexadecimal representation
Hash a string using SHA1 and truncate integer at given length
Use this function instead of Python hash(string) since with
Python 3 the seed used for hashing changes between Python sessions
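A minimal sketch of both variants using hashlib (the function names are illustrative); unlike the built-in hash(), SHA1 gives the same result in every Python session:

import hashlib

def stable_hash(string, length=10):
    # Hexadecimal SHA1 digest truncated to `length` characters (max 40)
    return hashlib.sha1(string.encode('utf-8')).hexdigest()[:length]

def stable_int_hash(string, length=10):
    # Integer variant: truncate the decimal representation of the SHA1 digest
    return int(str(int(hashlib.sha1(string.encode('utf-8')).hexdigest(), 16))[:length])

print(stable_hash('hello'))      # same result in every session
print(stable_int_hash('hello'))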
Go through the source directory, create any directories that do not already exist in destination directory,
and move files from source to the destination directory
Any pre-existing files will be removed first (via os.remove) before being replaced by the corresponding source file.
Any files or directories that already exist in the destination but not in the source will remain untouched
Patch a standard module/class with a new function/method.
Moves the original attribute to _original_<name> ONLY ONCE! If done
blindly, the patch would recurse into itself when modules are reloaded.
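A minimal sketch of the patch-once pattern described above (the helper name is illustrative):

import math

def patch_once(target, name, new_function):
    original_name = '_original_' + name
    if not hasattr(target, original_name):
        # Stash the original only the first time, so repeated patching (e.g. on module reload)
        # does not chain the wrapper onto itself
        setattr(target, original_name, getattr(target, name))
    setattr(target, name, new_function)

def noisy_sqrt(x):
    print('sqrt called with', x)
    return math._original_sqrt(x)  # call through to the stashed original

patch_once(math, 'sqrt', noisy_sqrt)
patch_once(math, 'sqrt', noisy_sqrt)  # harmless: the original is still preserved
print(math.sqrt(4.0))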
Makes up a random name with no spaces in it. Funnier than timestamps.
Parameters:
use_mood – bool
Use a mood instead of a color
digits – int
Number of digits in the random number (default: 2)
Returns:
string
The default format is [color]_[animal]_[container]_[two digit number]
Example: “blueviolet_kangaroo_prison_26”
Colors come from matplotlib’s list.
Alternative formats selected by keywords:
[mood]_[animal]_[container]_[two digit number]
Example: “menacing_guppy_pen_85”
Calls the OMFITmodule.convert_to_developer_mode() method for every top-level module in OMFIT
Intended for convenient access in a command box script that can be called as soon as the project is reloaded.
Accepts keywords related to OMFITmodule.convert_to_developer_mode() and passes them to that function.
Parameters:
module_link – OMFITmodule or OMFIT instance
Reference to the module to be converted or to top-level OMFIT
If None, defaults to OMFIT (affecting all top-level modules)
Compares two version numbers and determines which one, if any, is greater.
This function can handle wildcards (eg. 1.1.*)
Most non-numeric characters are removed, but some are given special treatment.
a, b, c represent alpha, beta, and candidate versions and are replaced by numbers -3, -2, -1.
So 4.0.1-a turns into 4.0.1.-3, 4.0.1-b turns into 4.0.1.-2, and then -3 < -2
so the beta will be recognized as newer than the alpha version.
rc# is recognized as a release candidate that is older than the version without the rc
So 4.0.1_rc1 turns into 4.0.1.-1.1 which is older than 4.0.1 because 4.0.1 implies 4.0.1.0.0.
Also 4.0.1_rc2 is newer than 4.0.1_rc1.
Parameters:
version1 – str
First version to compare
version2 – str
Second version to compare
Returns:
int
1 if version1 > version2
-1 if version1 < version2
0 if version1 == version2
0 if wildcards allow version ranges to overlap. E.g. 4.* vs. 4.1.5 returns 0 (equal)
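A simplified sketch of the comparison rules described above, using the stated -3/-2/-1 substitutions and wildcard handling (illustrative only; the real function treats more corner cases):

import re

def compare_versions(version1, version2):
    def tokenize(v):
        v = v.lower().replace('_', '.').replace('-', '.')
        v = re.sub(r'rc(\d+)', r'-1.\1', v)     # 4.0.1_rc2 -> 4.0.1.-1.2
        v = re.sub(r'(?<=\d)\.?a$', '.-3', v)   # trailing alpha marker -> -3
        v = re.sub(r'(?<=\d)\.?b$', '.-2', v)   # trailing beta marker -> -2
        return v.split('.')

    t1, t2 = tokenize(version1), tokenize(version2)
    # Pad the shorter version with zeros, so 4.0.1 is treated as 4.0.1.0.0
    t1, t2 = t1 + ['0'] * (len(t2) - len(t1)), t2 + ['0'] * (len(t1) - len(t2))
    for p1, p2 in zip(t1, t2):
        if p1 == '*' or p2 == '*':
            return 0                            # wildcard makes the ranges overlap
        if int(p1) != int(p2):
            return 1 if int(p1) > int(p2) else -1
    return 0

print(compare_versions('4.0.1-b', '4.0.1-a'))  # 1: beta is newer than alpha
print(compare_versions('4.0.1_rc1', '4.0.1'))  # -1: release candidate is older than the release
print(compare_versions('4.*', '4.1.5'))        # 0: wildcard overlap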
Given a list of strings with version numbers like 1.2.12, 1.2, 1.20.5, 1.2.3.4.5, etc., find the maximum version
number. Test with: print(repo.get_tag_version(‘v’))
Parameters:
versions – List of strings like [‘1.1’, ‘1.2’, ‘1.12’, ‘1.1.13’]
Returns:
A string from the list of versions corresponding to the maximum version number.
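A simple sketch of picking the maximum version by comparing numeric components as tuples (illustrative; padding and tie handling in the real function may differ):

import re

def find_latest_version(versions):
    def as_tuple(v):
        return tuple(int(p) for p in re.findall(r'\d+', v))
    return max(versions, key=as_tuple)

print(find_latest_version(['1.1', '1.2', '1.12', '1.1.13']))  # '1.12'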
Tk requires ‘ ’, ‘\’, ‘{’ and ‘}’ characters to be escaped.
Use of this function depends on the system and the behaviour can be
set on a system-by-system basis using the OMFIT_ESCAPE_TK_SPACES environmental variable
Note: by using this function tk does not manage the GUI geometry
which is beneficial when switching between desktops on some window
managers, which would otherwise re-center all windows to the desktop.
Parameters:
win – window to be centered
parent – window with respect to be centered (center with respect to screen if None)
xoff – x offset in pixels
yoff – y offset in pixels
allow_negative – whether window can be off the left/upper part of the screen
This function is useful when plugging in a new display in OSX and the OMFIT window disappear
To fix the issue, go to the XQuartz application and select the OMFIT window from the menu Window > OMFIT …
Then press F8 a few times until the OMFIT GUI appears on one of the screens.
This function must return the contents of the selection.
The function will be called with the arguments OFFSET and LENGTH which allows the chunking of very long selections.
The following keyword parameters can be provided: selection - name of the selection (default PRIMARY), type - type of the selection (e.g. STRING, FILE_NAME).
Parameters:
offset – allows the chunking of very long selections
length – allows the chunking of very long selections
selection – name of the selection (default set by $OMFIT_CLIPBOARD_SELECTION)
type – type of the selection (default set by $OMFIT_CLIPBOARD_TYPE)
The purpose of this class is to have a single point of interaction with a
given license. After the License is initialized, then it should be checked
when the given code (or member of a suite) is to be run.
All licenses are stored in $HOME/.LICENCES/<codename>
Parameters:
codename – The name of the code (or suite) to which the license applies
fn – The location (filename) of the software license
email_dict –
(optional) At least two members
email_address - The address(es) to which the email should be sent,
if multiple, as a list
email_list - A message describing any lists to which the user should
be added for email notifications or discussion
rootGUI – tkInter parent widget
web_address – (optional) A URL for looking up the code license
Using the basemap matplotlib toolkit, this function generates a map and
puts a marker at the location of every latitude and longitude found in the lists
Parameters:
lats – list of latitude floats
lons – list of longitude floats
wesn – list of 4 floats to clip map west, east, south, north
Processes a function docstring so it can be used as the help tooltip for a GUI element without looking awkward.
Protip: you can test this function on its own docstring.
Example usage:
def my_function():
    '''
    This docstring would look weird as a help tooltip if used directly (as in help=my_function.__doc__).
    The line breaks and indentation after line breaks will not be appropriate. Also, a long line like this will be
    broken automatically when put into a help tooltip, but it won't be indented. However, putting it through
    clean_docstring_for_help() will solve all these problems and the re-formatted text will look better.
    '''
    print('ran my_function')
    return 0

OMFITx.Button("Run my_function", my_function, help=clean_docstring_for_help(my_function))
Parameters:
string_or_function_in – The string to process, expected to be either the string stored in function.__doc__
or just the function itself (from which .__doc__ will be read). Also works with OMFITpythonTask, OMFITpythonGUI,
and OMFITpythonPlot instances as input.
remove_params – T/F: Only keep the docstring up to the first instance of “ :param “ or “ :return “ as
extended information about parameters might not fit well in a help tooltip.
remove_deep_indentation – T/F: True: Remove all of the spaces between a line break and the next non-space
character and replace with a single space. False: Remove exactly n spaces after a line break, where n is the
indentation of the first line (standard dedent behavior).
Returns:
Cleaned up string without spacing or line breaks that might look awkward in a GUI tooltip.
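A rough sketch of the two cleanup steps described above, assuming reStructuredText-style :param:/:return: fields (illustrative, not the actual implementation; the exact dedent behavior may differ):

import re
import textwrap

def clean_docstring_for_help_sketch(string_or_function_in, remove_params=True, remove_deep_indentation=True):
    doc = string_or_function_in if isinstance(string_or_function_in, str) else (string_or_function_in.__doc__ or '')
    if remove_params:
        # Keep only the text before the first :param or :return field
        doc = re.split(r'\n\s*:(?:param|return)', doc)[0]
    doc = textwrap.dedent(doc).strip()
    if remove_deep_indentation:
        # Replace the indentation after each line break with a single space
        doc = re.sub(r'\n[ \t]+', '\n ', doc)
    return doc

print(clean_docstring_for_help_sketch(textwrap.dedent))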
This is a utility function to list the methods that need to be defined
for a class to behave like a numeric type in Python 3. This used to be
done by the __coerce__ method in Python 2, which is no longer available.
Parameters:
binary_function – string to be used for binary operations
unary_function – string to be used for unary operations
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as is, this method detects whether .filename was changed and, if so, copies the file from the original .filename (saved in the .link attribute) to the new .filename
A branch in the tree is represented in the filesystem as a directory.
Note that the OMFIT object itself belongs to the OMFITmainTree class,
which is a subclass of the OMFITtree class.
Parameters:
filename – ‘directory/bla/OMFITsave.txt’ or ‘directory/bla.zip’ where the OMFITtree will be saved
(if ‘’ it will be saved in the same folder of the parent OMFITtree)
only – list of strings used to load only some of the branches from the tree (eg. [“[‘MainSettings’]”,”[‘myModule’][‘SCRIPTS’]”]
modifyOriginal – by default OMFIT will save a copy and then overwrite previous save only if successful.
If modifyOriginal=True and filename is not .zip, will write data directly at destination,
which will be faster but comes with the risk of deleting a good save if the new save
fails for some reason
readOnly – will place entry in OMFITsave.txt of the parent so that this OMFITtree can be loaded,
but will not save the actual content of this subtree. readOnly=True is meant to be
used only after this subtree is deployed where its filename says it will be. Using this
feature could result in much faster project saves if the content of this tree is large.
quiet – Verbosity level
developerMode – load OMFITpython objects within the tree as modifyOriginal
serverPicker – take server/tunnel info from MainSettings[‘SERVER’]
remote – access the filename in the remote directory
server – if specified the file will be downsync from the server
tunnel – access the filename via the tunnel
**kw – Extra keywords are passed to the SortedDict class
Similarly to the duplicate method for OMFITobjects, this method makes a copy by files.
This means that the objects in the returned subtree will point to different files from those of the original object.
This is in contrast to a deepcopy of an object, which copies the objects in memory but does not duplicate the underlying files.
Parameters:
filename – if filename=’’ then the duplicated subtree and its files will live in the OMFIT working directory,
if filename=’directory/OMFITsave.txt’ then the duplicated subtree and its files will live in directory specified
modifyOriginal – only if filename!=’’
by default OMFIT will save a copy and then overwrite previous save only if successful.
If modifyOriginal=True and filename is not .zip, will write data directly at destination,
which will be faster but comes with the risk of deleting a good save if the new save
fails for some reason
readOnly – only if filename!=’’
will place entry in OMFITsave.txt of the parent so that this OMFITtree can be loaded,
but will not save the actual content of this subtree. readOnly=True is meant to be
used only after this subtree is deployed where its filename says it will be. Using this
feature could result in much faster project saves if the content of this tree is large.
quiet – Verbosity level
Returns:
new subtree, with objects pointing to different files from those of the original object
NOTE: readOnly+modifyOriginal is useful because one can get significant read (modifyOriginal) and write (readOnly) speed-ups,
but this feature relies on the users pledging they will not modify the content under this subtree.
filename – ‘directory/bla/OMFITsave.txt’ or ‘directory/bla.zip’ where the OMFITtree will be saved
(if ‘’ it will be saved in the same folder of the parent OMFITtree)
only – list of strings used to load only some of the branches from the tree (eg. [“[‘MainSettings’]”,”[‘myModule’][‘SCRIPTS’]”]
modifyOriginal – by default OMFIT will save a copy and then overwrite previous save only if successful.
If modifyOriginal=True and filename is not .zip, will write data directly at destination,
which will be faster but comes with the risk of deleting a good save if the new save
fails for some reason
readOnly – will place entry in OMFITsave.txt of the parent so that this OMFITtree can be loaded,
but will not save the actual content of this subtree. readOnly=True is meant to be
used only after this subtree is deployed where its filename says it will be. Using this
feature could result in much faster project saves if the content of this tree is large.
quiet – Verbosity level
developerMode – load OMFITpython objects within the tree as modifyOriginal
lazyLoad – enable/disable lazy load of picklefiles and xarrays
selector – sets selection of a specific realization (None for all realizations, ‘random’ for a random realization). This can also be an OMFITexpression.
strategy – sets operation to be performed on all realizations (if .selector == None)
raise_errors –
sets how to proceed in case of errors (eg. missing objects, attributes)
None: print warning on errors
True: raise errors
False: ignore errors
no_subdir –
This keyword affects how the OMFITcollection is deployed.
False (default) the OMFITcollection is deployed like a normal OMFITtree, such that the files of the
collection are deployed under a subdirectory, whose name comes from the entry in the tree
True the files are deployed in the same level directory as the parent
.selector, .strategy, .raise_errors can be modified after the object is instantiated
>> tmp=OMFITcollection(raise_errors=True)
>> for k in range(1,10):
>>     tmp[k]={}
>>     tmp[k]['hello']=np.linspace(0,1,100)**k
>>
>> # return a single realization
>> tmp.selector=5
>> tmp.strategy=None
>> pyplot.plot(tmp['hello'],'--r',lw=2)
>>
>> # return all realizations
>> tmp.selector=None
>> tmp.strategy=None
>> plotc(tmp['hello'])
>>
>> # attribute on all realizations
>> tmp.selector=None
>> tmp.strategy=None
>> print(tmp['hello'].mean())
>>
>> # perform operation on all realizations
>> tmp.selector=None
>> tmp.strategy='np.mean(x,axis=1)'
>> pyplot.plot(tmp['hello'],'k--',lw=2)
>>
>> OMFIT['variable']=3
>> tmp.selector=OMFITexpression("OMFIT['variable']")
>> print(tmp)
>>
>> print(tmp.GET(3)['hello']-tmp['hello'])
>>
>> # to update values, you can use the UPDATE method
>> tmp.UPDATE(location="['hello'][0]",values=-1e6)
>> tmp.selector=None
>> tmp.strategy=None
>> print(tmp['hello'].mean())
Parameters:
filename – ‘directory/bla/OMFITsave.txt’ or ‘directory/bla.zip’ where the OMFITtree will be saved
(if ‘’ it will be saved in the same folder of the parent OMFITtree)
only – list of strings used to load only some of the branches from the tree (eg. [“[‘MainSettings’]”,”[‘myModule’][‘SCRIPTS’]”]
modifyOriginal – by default OMFIT will save a copy and then overwrite previous save only if successful.
If modifyOriginal=True and filename is not .zip, will write data directly at destination,
which will be faster but comes with the risk of deleting a good save if the new save
fails for some reason
readOnly – will place entry in OMFITsave.txt of the parent so that this OMFITtree can be loaded,
but will not save the actual content of this subtree. readOnly=True is meant to be
used only after this subtree is deployed where its filename says it will be. Using this
feature could result in much faster project saves if the content of this tree is large.
quiet – Verbosity level
developerMode – load OMFITpython objects within the tree as modifyOriginal
serverPicker – take server/tunnel info from MainSettings[‘SERVER’]
remote – access the filename in the remote directory
server – if specified the file will be downsync from the server
tunnel – access the filename via the tunnel
**kw – Extra keywords are passed to the SortedDict class
Writes the key entry in the collection dictionary, regardless of the value of the .selector.
This is equivalent to self[key]=value when self.selector=None.
Returns the key entry from the collection dictionary, regardless of the value of the .selector.
This is equivalent to self[key] when self.selector=None.
Returns whether key is in the collection dictionary, regardless of the value of the .selector.
This is equivalent to key in self when self.selector=None.
A class for holding results from a Monte Carlo set of runs
Effectively this is a OMFITcollection class with default strategy set to: uarray(np.mean(x,1),np.std(x,1))
Parameters:
filename – ‘directory/bla/OMFITsave.txt’ or ‘directory/bla.zip’ where the OMFITtree will be saved
(if ‘’ it will be saved in the same folder of the parent OMFITtree)
only – list of strings used to load only some of the branches from the tree (eg. [“[‘MainSettings’]”,”[‘myModule’][‘SCRIPTS’]”]
modifyOriginal – by default OMFIT will save a copy and then overwrite previous save only if successful.
If modifyOriginal=True and filename is not .zip, will write data directly at destination,
which will be faster but comes with the risk of deleting a good save if the new save
fails for some reason
readOnly – will place entry in OMFITsave.txt of the parent so that this OMFITtree can be loaded,
but will not save the actual content of this subtree. readOnly=True is meant to be
used only after this subtree is deployed where its filename says it will be. Using this
feature could result in much faster project saves if the content of this tree is large.
quiet – Verbosity level
developerMode – load OMFITpython objects within the tree as modifyOriginal
serverPicker – take server/tunnel info from MainSettings[‘SERVER’]
remote – access the filename in the remote directory
server – if specified the file will be downsync from the server
tunnel – access the filename via the tunnel
**kw – Extra keywords are passed to the SortedDict class
filename – ‘directory/bla/OMFITsave.txt’ or ‘directory/bla.zip’ where the OMFITtree will be saved
(if ‘’ it will be saved in the same folder of the parent OMFITtree)
only – list of strings used to load only some of the branches from the tree (eg. [“[‘MainSettings’]”,”[‘myModule’][‘SCRIPTS’]”]
modifyOriginal – by default OMFIT will save a copy and then overwrite previous save only if successful.
If modifyOriginal=True and filename is not .zip, will write data directly at destination,
which will be faster but comes with the risk of deleting a good save if the new save
fails for some reason
readOnly – will place entry in OMFITsave.txt of the parent so that this OMFITtree can be loaded,
but will not save the actual content of this subtree. readOnly=True is meant to be
used only after this subtree is deployed where its filename says it will be. Using this
feature could result in much faster project saves if the content of this tree is large.
quiet – Verbosity level
developerMode – load OMFITpython objects within the tree as modifyOriginal
serverPicker – take server/tunnel info from MainSettings[‘SERVER’]
remote – access the filename in the remote directory
server – if specified the file will be downsync from the server
tunnel – access the filename via the tunnel
**kw – Extra keywords are passed to the SortedDict class
filename – None: create new module from skeleton, ‘’: create an empty module
filename – ‘directory/bla/OMFITsave.txt’ or ‘directory/bla.zip’ where the OMFITtree will be saved
(if ‘’ it will be saved in the same folder of the parent OMFITtree)
only – list of strings used to load only some of the branches from the tree (eg. [“[‘MainSettings’]”,”[‘myModule’][‘SCRIPTS’]”]
modifyOriginal – by default OMFIT will save a copy and then overwrite previous save only if successful.
If modifyOriginal=True and filename is not .zip, will write data directly at destination,
which will be faster but comes with the risk of deleting a good save if the new save
fails for some reason
readOnly – will place entry in OMFITsave.txt of the parent so that this OMFITtree can be loaded,
but will not save the actual content of this subtree. readOnly=True is meant to be
used only after this subtree is deployed where its filename says it will be. Using this
feature could result in much faster project saves if the content of this tree is large.
quiet – Verbosity level
developerMode – load OMFITpython objects within the tree as modifyOriginal
serverPicker – take server/tunnel info from MainSettings[‘SERVER’]
remote – access the filename in the remote directory
server – if specified the file will be downsync from the server
tunnel – access the filename via the tunnel
**kw – Extra keywords are passed to the SortedDict class
Method to store a snapshot of the current module status and save it under self[‘__STORAGE__’][runid]
where runid is set under self[‘SETTINGS’][‘EXPERIMENT’][‘runid’]
Parameters:
runid – runid to be de-stored. If None the runid is taken from self[‘SETTINGS’][‘EXPERIMENT’][‘runid’]
return_associated_git_branch – whether to return just the path of each directory or also the remote/branch info.
This requires parsing of the OMFIT modules in a directory and it can be quite slow,
however the info is buffered, so later accesses are faster.
separator – text to use to separate the path and the remote/branch info
checkIsWriteable – checks if user has write access.
Note: if checkIsWriteable=’git’ will return a directory even if it is not writable, but it is a git repository
directories – list of directories to check. If None the list of directories is taken from OMFIT[‘MainSettings’][‘SETUP’][‘modulesDir’]
Method that resolves the OMFITexpressions that are found in a module
root[‘SETTINGS’][‘EXPERIMENT’] and returns the locations that
those expressions point to.
Parameters:
*args – list of keys to return the absolute location of
Returns:
dictionary with the absolute location the expressions point to
Method used to set the values of the entries under root[‘SETTINGS’][‘EXPERIMENT’]
This method resolves the OMFITexpressions that are found in a module
root[‘SETTINGS’][‘EXPERIMENT’] and sets the values at the locations
that those expressions point to.
Same class of SortedDict, but is not saved when a project is saved
This class is used to define the __scratch__ space under each module as well as the global OMFIT[‘scratch’]
filename – ‘directory/bla/OMFITsave.txt’ or ‘directory/bla.zip’ where the OMFITtree will be saved
(if ‘’ it will be saved in the same folder of the parent OMFITtree)
only – list of strings used to load only some of the branches from the tree (eg. [“[‘MainSettings’]”,”[‘myModule’][‘SCRIPTS’]”]
modifyOriginal – by default OMFIT will save a copy and then overwrite previous save only if successful.
If modifyOriginal=True and filename is not .zip, will write data directly at destination,
which will be faster but comes with the risk of deleting a good save if the new save
fails for some reason
readOnly – will place entry in OMFITsave.txt of the parent so that this OMFITtree can be loaded,
but will not save the actual content of this subtree. readOnly=True is meant to be
used only after this subtree is deployed where its filename says it will be. Using this
feature could result in much faster project saves if the content of this tree is large.
quiet – Verbosity level
developerMode – load OMFITpython objects within the tree as modifyOriginal
serverPicker – take server/tunnel info from MainSettings[‘SERVER’]
remote – access the filename in the remote directory
server – if specified the file will be downsync from the server
tunnel – access the filename via the tunnel
**kw – Extra keywords are passed to the SortedDict class
Returns dictionary with the saved project information.
Parameters:
filename – filename of the project to return info about.
If filename=’’ then returns info about the current project.
Note that project information is updated only when the project is saved.
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as is, this method detects whether .filename was changed and, if so, copies the file from the original .filename (saved in the .link attribute) to the new .filename
This class handles so called OMFIT dynamic expressions.
If you generate dynamic expressions in a Python script, note that the relative location of the expression
(root,parent,DEPENDENCIES,…) is evaluated with respect to where the expression is in the tree,
not relative to where the script which generated the expression resides
If you are using relative locations in your expression, things may not work if you have the same expression in
two locations in the tree. A classic example is an in-memory copy of something (e.g. a namelist) containing
an expression to some other location in the tree. If this happens the results are unpredictable.
Parameters:
expression – string containing python code to be dynamically evaluated every time an attribute of this object is accessed
Subclass of OMFITexpression used for iterable objects
The distinction between iterable and not iterable expressions
is used in case someone tests for iterability of an object
This function provides a dictionary references to some useful quantities with respect to the object specified in location.
Note that the variables in the returned dictionary are the same ones that are available within the namespace of OMFIT scripts and expressions.
Parameters:
location – location in the OMFIT tree
Returns:
dictionary containing the following variables:
OMFITlocation : list of references to the tree items that compose the OMFIT-tree path
OMFITlocationName : list of path strings to the tree items that compose the OMFIT-tree path
parent : reference to the parent object
parentName : path string to the parent object
this : reference to the current object
thisName : path string to the current object
OMFITmodules : list of modules to the current module
OMFITmodulesName : list of string paths to the current module
key – function that returns a string that is used for sorting or dictionary key whose content is used for sorting
>> tmp=SortedDict()
>> for k in range(5):
>>     tmp[k]={}
>>     tmp[k]['a']=4-k
>> # by dictionary key
>> tmp.sort(key='a')
>> # or equivalently
>> tmp.sort(key=lambda x:tmp[x]['a'])
Parameters:
**kw – additional keywords passed to the underlying list sort command
Return the object that dynamic expressions return when evaluated
This allows OMFITexpression(‘None’) is None to work as one would expect.
Expressions that are invalid will raise an OMFITexception when evaluated
Parameters:
inv – input object
Returns:
If inv was a dynamic expression, returns the object that dynamic expressions return when evaluated
Class used to parse bibtex files
The class should be saved as a dictionary of dictionaries (one dictionary for each bibtex entry)
Each bibtex entry must have defined the keys: ENTRYTYPE and ID
Parameters:
filename – filename of the .bib file to parse
**kw – keyword dictionary passed to OMFITascii class
To generate list of own publications:
1. Export all of your citations from https://scholar.google.com to a citation.bib bibtex file
2. OMFIT['bib']=OMFITbibtex('.bib')                 # load citations as OMFITbibtex
3. OMFIT['bib'].doi(deleteNoDOI=True)               # remove entries which do not have a DOI (ie. conferences)
4. OMFIT['bib'].sanitize()                          # fix entries where needed
5. OMFIT['bib'].update_ID(as_author='Meneghini')    # sort entries and distinguish between first author or contributed
6. print('\n\n'.join(OMFIT['bib'].write_format()))  # write to whatever format desired
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as is, this method detects whether .filename was changed and, if so, copies the file from the original .filename (saved in the .link attribute) to the new .filename
Principal data class used by _Detachment_Indicator.
Data is fetched as an MDSplus pointname, truncated, ELM filtered, smoothed and remapped to an independent
variable. See below for inherited objects with specific implementation needs (eg langmuir probes, DTS …)
Parameters:
tag – string
a keyword name for this signal. Eg. pointname of MDSplus signal
A DI_signal doesn’t need to know the details of the ELM identification, which is performed
globally for a shot and then shared out to a variety of signals/diagnostics for processing.
As such, the ELM filtering is done here using a pre-defined omfit_elm object.
Parameters:
filter_settings – Dict
dict containing the values to be passed to ELM filtering. See OMFITelm for specification
DTS specific methods. Nothing fancy here, just a parsing of the DTS arrays stored in MDSplus. It can’t be done
with DI_signal as the MDSplus arrays are 2D.
Initialization.
Parameters:
tag – string
Reference name for quantity. This will be in the form DTS_x_y where x is the quantity (eg DENS, TEMP)
and y is the channel (eg 0 for the channel closest to the plate). Eg ‘DTS_TEMP_0’. Case independent.
A convenient way to make a DI_signal object when the times and values are coming from file. For example
a custom-read of some data from an IDL save file. In these cases, it is sometimes easier just to
pass the raw data straight to the object and then it will be able to handle the filtering, remapping etc. in
a way that is equivalent to the other diagnostics.
Initialize DI object with tag as specified.
:param tag: string
Tag name for signal, used for identification purposes
No fetching actually occurs for this subclass. In this case, the otherwise-sourced data is
simply placed into the attributes that are required by the processing methods in DI_signal.
This subclass can’t know anything about where the LP data is being saved in the tree. It can however operate
on one of the probe sub trees. That means the Langmuir_Probe module tree needs to be populated first,
and then one of the sub trees from its [‘OUTPUT’] can be passed into here, alongside the physical quantity
to be extracted from the tree.
Parameters:
probe_tree – OMFITTree
a subtree of the Langmuir_Probe module corresponding
Parameters:
param_name – string
The name of the physical parameter to be extracted from probe tree (eg. JSAT, DENS, or TEMP).
Parameters:
units – OMFITTree
Descriptors of the units for each of the param_name options. This often lives
in Langmuir_Toolbox[‘OUTPUTS’][‘LP_MDS’][‘units’]
Parameters:
processing_settings – Dict
settings dict containing filter_settings, smoothing_settings, DOD_tmin, DOD_tmax. See process_data.py
for full details.
inputNames – train on these inputs (use all arrays not starting with OUT_ if not specified)
outputNames – train on these outputs (use all arrays starting with OUT_ if not specified)
max_iterations – >0 max number of iterations
<0 max number of iterations without improvement
hidden_layer – list of integers defining the NN hidden layer topology
connection_rate – float from 0. to 1. defining the density of the synapses
noise – add gaussian noise to training set
fraction – fraction of data used for training (the rest being for validation)
fraction>0 fraction random splitting
-1<fraction<0 fraction sequential splitting
fraction>1 index sequential splitting
norm_output – normalize outputs
output_mean_0 – force average of normalized outputs to have 0 mean
robust_stats – 0<x<100 percentile of data to be considered
=0 mean and std
<0 median and mad
weight_decay – exponential forget of the weight
spring_decay – link training weight decay to the validation error
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as is, this method detects whether .filename was changed and, if so, copies the file from the original .filename (saved in the .link attribute) to the new .filename
Convert to the OMFITfocuscoils ascii file format, which stores Fourier Harmonics
of the coils. These files have additional settings, which can be set using key
word arguments. Be sure to sync these with the focus input namelist!
Parameters:
filename – str. Name of new file
nfcoil – int. Number of harmonics in decomposition
target_length – float. Target length of coils. Zero targets the initial length.
coil_type – int.
coil_symm – int. 0 for no symmetry. 1 for toroidal symmetry matching the boundary bnfp.
nseg – int. Number of segments. Default (None) uses the number of segments in the original file
Estimates uncertainty in some quantity indicated by reference.
The primary method relies on assuming that high frequency variation is noise that can be described as
random measurement error or uncertainty. This requires a definition of “high” frequency, which is based on
the system smoothing timescale.
Parameters:
reference – string
The data to be analyzed are x, y = self[‘history’][‘x’], self[‘history’][reference]
min_frac_err – float
Minimum error in y as a fraction of y
max_frac_err – float
Maximum error in y as a fraction of y
Returns:
float array
Uncertainty in self[‘history’][reference], with matching shape and units.
p_first_order(par, x, u, uniform_dx=None)[source]¶
Calculates expected response to a given target waveform using a first order plus dead time (FOPDT) model
Parameters:
par – Parameters instance
Fit parameters
x – float array
independent variable
u – float array
command or actuator history as a function of x
uniform_dx – bool
Evenly spaced x?
Returns:
float array
Expected response vs. time
first_order(x, u, y0, gain, lag, scale, d_gain_du=0, ex=0, u0=None)[source]¶
Calculates expected response to a given target waveform using a first order plus dead time (FOPDT) model
Parameters:
x – float array
independent variable
u – float array
command or actuator history as a function of x
y0 – float
Initial y value (y units)
gain – float
Gain = delta_y / delta_u: how much will response variable change given a change in command?
Units = y units / u units
lag – float
How long before a change in command begins to cause a change in response? (x units)
scale – float
Timescale for response (x units)
d_gain_du – float
Change in gain per change in u away from u0. This is normally exactly 0. Should you really change it?
ex – float
Exponent ex in transformation Y = y * (y/y0)**ex. Transforms output y after running model.
It is a modification of the standard model.
This seemed like it was worth a shot, but it didn’t do me any good.
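For orientation, here is a minimal sketch of a first order plus dead time (FOPDT) step response using the y0/gain/lag/scale parameters described above, integrated with a simple explicit Euler scheme (illustrative only; the OMFIT fit functions are more general):

import numpy as np

def fopdt_response(x, u, y0, gain, lag, scale, u0=None):
    u0 = u[0] if u0 is None else u0
    # Dead time: the response sees the command delayed by `lag`
    u_delayed = np.interp(x - lag, x, u, left=u[0])
    y = np.empty_like(x, dtype=float)
    y[0] = y0
    for i in range(1, len(x)):
        dx = x[i] - x[i - 1]
        y_target = y0 + gain * (u_delayed[i - 1] - u0)        # steady-state response to the delayed command
        y[i] = y[i - 1] + dx * (y_target - y[i - 1]) / scale  # first-order relaxation with timescale `scale`
    return y

x = np.linspace(0, 10, 500)
u = np.where(x > 1.0, 1.0, 0.0)  # unit step in the command at x = 1
y = fopdt_response(x, u, y0=0.0, gain=2.0, lag=0.5, scale=1.5)
print(y[-1])  # approaches y0 + gain * delta_u = 2.0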
second_order(x, u, y0, gain, lag, scale, damping, d_gain_du=0, ex=0, u0=None, uniform_dx=None)[source]¶
Calculates expected response to a given target waveform using a second order plus dead time (SOPDT) model
Parameters:
x – float array
independent variable
u – float array
command or actuator history as a function of x
y0 – float
Initial y value (y units)
gain – float
Gain = delta_y / delta_u: how much will response variable change given a change in command?
Units = y units / u units
lag – float
How long before a change in command begins to cause a change in response? (x units)
scale – float
Timescale for response (x units)
damping – float
unitless
d_gain_du – float
Change in gain per change in u away from u0. This is normally exactly 0. Should you really change it?
ex – float
Exponent X for transforming to Y in in Y = y (y/y0)^X
u0 – float
Reference value of u. Defaults to u[0].
uniform_dx – bool
x is evenly spaced
Returns:
float array
Expected response vs. time
third_order(x, u, y0, gain, lag, scale, damping, a3, d_gain_du=0, ex=0, u0=None, uniform_dx=None)[source]¶
Calculates expected response to a given target waveform using a third order plus dead time (TOPDT) model
Where I made up the third order expression and the exact implementation may not be standard.
Because this isn’t confirmed as a standard sys id model, it is not recommended for use.
Parameters:
x – float array
independent variable
u – float array
command or actuator history as a function of x
y0 – float
Initial y value (y units)
gain – float
Gain = delta_y / delta_u: how much will response variable change given a change in command?
Units = y units / u units
lag – float
How long before a change in command begins to cause a change in response? (x units)
scale – float
Timescale for response (x units)
damping – float
unitless
a3 – float
unitless factor associated with third order term
d_gain_du – float
Change in gain per change in u away from u0. This is normally exactly 0. Should you really change it?
ex – float
Exponent X for transforming to Y in in Y = y (y/y0)^X
u0 – float
Reference value of u. Defaults to u[0].
order – int
Order of the model to use. Must be already complete.
x – float array
Independent variable
1D: single shot
2D: multi-shot
y – float array
Response data as a function of x
u – float array
Target or command data as a function of x
y_err – float array [optional]
Uncertainty in y, matching shape and units of y
extra_data –
list of dicts [optional]
Each item contains:
value: float array (required). Must match dimensions of x or supply its own xx
xx: dependent variable associated with value. Required if value isn’t the same shape as x
axes: ‘y’ or ‘u’, denoting whether quantity should be grouped w response or cmd/target (optional)
label: string (optional)
color: matplotlib color spec (optional)
linestyle: matplotlib line style spec (optional)
plotkw: dict: additional plot keywords (optional)
split – bool
When plotting multiple shots, don’t overlay them; make more subplots so each can be separate.
Set to None to split when the input is 2D, and leave alone when only one shot is considered.
kw –
Additional keywords
show_guess: bool [optional]
Switches display of guesses on/off.
Defaults to on for single shot mode and off for multi-shot mode.
show_model: bool
Switches display of fit model on/off. Defaults to on. One might want to turn it off if one needs to
customize just this trace with a separate plot command later.
Treat u as a command with different units and scale than y, as opposed to a target,
which should be comparable to y. Defaults to reading from the fit output.
name: string
Tag name of fit output
annotate: bool
Mark annotations on the plot. Ignored if X is 2D unless ishot is used to select a single shot.
ishot: int or None
If int, selects which column in multi-shot/2D input to plot. If None, plots all shots/column.
Ignored if x is 1D.
This is an index from 0, not an actual literal shot number.
Manages fits to control responses to commands to identify system parameters
Parameters:
x –
float array
Independent variable; can be used for weighting steps in case of uneven spacing of data
Single shot mode: x should be 1D
Multi-shot mode: x should be 2D, with x[:, i] being the entry for shot i.
If some shots have shorter x than others, pad the end with NaNs to get consistent length.
Variables like gain, lag, and scale will be fit across all shots together
assuming that the same system is being studied in each.
response – float array matching shape of x
Dependent variable as a function of x.
These are actual measurements of the quantity that the system tried to control.
command – float array matching shape of x
Dependent variable as a function of x
This is the command that was used to try to produce a response in y
response_uncertainty – float array [optional]
Uncertainty in response, matching dimensions and units of response. Defaults to 1.
order – int
time_range – two element numeric iterable in x units
Used to control gathering of data if x, response, command, and target aren’t supplied directly.
If x units are converted using ???_x_factor, specify time_range in final units after conversion.
[response, command, target, enable]_pointname –
string
Alternative to supplying x, y data for response, command, or target.
Requires the following to be defined:
device
shot
time_range
Also considers the following, if supplied:
???_treename
???_x_factor
???_y_factor
Where ??? is one of [response, command, target]
[response, command, target, enable]_treename – string or None
string: MDSplus tree to use as data source for [response, command, or target]
None: Use PTDATA
[response, command, target, enable]_x_factor – float
Multiply x data by this * overall_x_factor.
All factors are applied immediately after gathering from MDSplus and before performing any operations
or comparisons. All effective factors are the product of an individual factor and an overall factor.
So, response x data are mds.dim_of(0) * response_x_factor * overall_x_factor
enable_value – float
If supplied, data gathered using enable_pointname and enable_treename must be equal to this value in
order for the command to be non-zero.
x – float array
Independent variable; can be used for weighting steps in case of uneven spacing of data
measurement – float array
Dependent variable as a function of x.
These are actual measurements of the quantity that the system tried to control.
target – float or float array
Dependent variable as a function of x, or a scalar value.
This is the target for measurement. Perfect control would be measurement = target at every point.
measurement_uncertainty – float array [optional]
Uncertainty in measurement in matching dimensions and units. If not supplied, it will be estimated.
enable – float array [optional]
Enable switch. Don’t count activity when control is disabled.
Disabled means when the enable switch doesn’t match enable_value.
command – float array [optional]
Command to the actuator.
Some plots will be skipped if this is not provided, but most analysis can continue.
xwin – list of two-element numeric iterables in units of x
Windows for computing some control metrics, like integral_square_error.
You can use this to highlight some step changes in the target and get responses to different steps.
units – string
Units of measurement and target
command_units – string [optional]
Units of command, if supplied. Used for display.
time_units – string
Units of x
Used for display, so quote units after applying overall_x_factor or measurement_x_factor, etc.
time_range – two element numeric iterable in x units
Used to control gathering of data if x, response, command, and target aren’t supplied directly.
If x units are converted using ???_x_factor, specify time_range in final units after conversion.
string
Alternative to supplying x, y data for measurement, target, measurement_uncertainty, enable, or command.
Requires the following to be defined:
device
shot
time_range
Also considers the following, if supplied:
???_treename
???_x_factor
???_y_factor
Where ??? is one of [measurement, target, measurement_uncertainty]
[measurement, target, enable, command]_treename – string or None
string: MDSplus tree to use as data source for measurement or target
None: Use PTDATA
[measurement, target, enable, command]_x_factor – float
Multiply x data by this * overall_x_factor.
All factors are applied immediately after gathering from MDSplus and before performing any operations
or comparisons. All effective factors are the product of an individual factor and an overall factor.
So, target x data are mds.dim_of(0) * target_x_factor * overall_x_factor
Assumes that turning on the control counts as changing the target and modifies the data accordingly:
it prepends the history with a step where the target matches the initial measured value.
Then if the measurement is initially about 12 eV and the control is turned on with
a target of 5 eV, it will act like the target changed from 12 to 5.
Plots primary quantities like measurement and target
Parameters:
ax – Axes instance
time_range – sorted two element numeric iterable [optional]
Zoom into only part of the data by specifying a new time range in xunits (probably seconds).
If not provided, the default time_range will be used.
kw –
extra keywords passed to pyplot.plot()
Recommended: instead of setting color here, set
self[‘__plot__’][‘target_color’] and self[‘__plot__’][‘measurement_color’]
Recommended: do not set label, as it will affect both the target and measurement
norm – bool
Normalize quantities based on typical independent and dependent data scales or intervals.
The norm factors are in self[‘summary’][‘norm’] and self[‘summary’][‘xnorm’]
time_range – sorted two element numeric iterable [optional]
Zoom into only part of the data by specifying a new time range in xunits (probably seconds).
If not provided, the default time_range will be used.
quote_avg – bool
Include all-time average in a legend label
kw – extra keywords passed to pyplot.plot()
Recommended: don’t set label as it will override both series drawn by this function
Note: if alpha is set and numeric, the unsmoothed trace will have alpha *= 0.3
time_range – sorted two element numeric iterable [optional]
Zoom into only part of the data by specifying a new time range in xunits (probably seconds).
If not provided, the default time_range will be used.
kw – additional keywords passed to plot
Recommended: do not set color this way. Instead, set the list self[‘__plot__’][‘other_error_colors’]
Recommended: do not set label this way. It will override all three traces’ labels.
Plot derivatives and stuff that goes into correlation tests
Parameters:
ax – Axes instance
norm – bool
time_range – sorted two element numeric iterable [optional]
Zoom into only part of the data by specifying a new time range in xunits (probably seconds).
If not provided, the default time_range will be used.
kw – extra keywords passed to plot
Recommended: don’t set label or color
Alpha will be multiplied by 0.3 for some plots if it is numeric
Plots error vs. other quantities, like rate of change of target
Parameters:
ax – Axes instance
which – string
Controls the quantity on the x axis of the correlation plot
‘target’ has a special meaning
otherwise, it’s d(target)/dx
norm – bool
time_range – sorted two element numeric iterable [optional]
Zoom into only part of the data by specifying a new time range in xunits (probably seconds).
If not provided, the default time_range will be used.
kw – Additional keywords passed to pyplot.plot()
Recommended: Don’t set marker as it is set for both series
If alpha is numeric, it will be multiplied by 0.3 for some data but not for others
time_range – sorted two element numeric iterable [optional]
Zoom into only part of the data by specifying a new time range in xunits (probably seconds).
If not provided, the default time_range will be used.
kw – Additional keywords passed to pyplot.plot()
Recommended: Don’t set label as it will affect both series
If it is numeric, alpha will be multiplied by 0.3 for some data but not others
Replaces illegal characters in a string so it can be used as a key in the OMFIT tree
Parameters:
key – string
Returns:
string
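As a rough illustration of the idea, a minimal sketch is shown below; the real helper's name and exact character rules may differ, so treat both as assumptions:

import re

def sanitize_key(key):  # hypothetical name and character rules, for illustration only
    """Replace characters that cannot appear in an OMFIT tree key with underscores."""
    return re.sub(r'[^0-9A-Za-z_]', '_', str(key))

sanitize_key("T_e (eV)")   # -> 'T_e__eV_'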
omfit_classes.omfit_ctrl_analysis.remove_periodic_noise(x, y, baseline_interval, amp_threshold=0.1, min_freq='auto', max_freq=None, debug=False)
Tries to remove periodic noise from a signal by characterizing a baseline interval and extrapolating:
FFT the signal during the baseline interval, when there should be no real signal (all noise);
cut off low-amplitude parts of the FFT and those outside of min/max frequency;
find frequencies where the baseline has high amplitude;
suppress frequency components that appear prominently in the baseline.
Parameters:
x – 1D float array
y – 1D float array
baseline_interval – two element numeric iterable
Should give start and end of time range / x range to be used for noise
characterization. Both ends must be within the range of X.
amp_threshold – float
Fraction of peak FFT magnitude (not counting 0 frequency) to use as a threshold.
FFT components with magnitudes below this will be discarded.
min_freq – float [optional]
Also remove low frequencies while cleaning low amplitude components out of the FFT
max_freq – float [optional]
Also remove high frequencies while cleaning low amplitude components out of the FFT
debug – bool
Returns intermediate quantities along with the primary result
Returns:
1D float array or dictionary
If debug is False: 1D array
y, but with best estimate for periodic noise removed
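A hedged usage sketch on synthetic data (the import path follows the dotted name in the signature above; the signal values are made up for illustration):

import numpy as np
from omfit_classes.omfit_ctrl_analysis import remove_periodic_noise

x = np.linspace(0, 1, 2000)                   # time base (s)
signal = np.where(x > 0.5, 1.0, 0.0)          # real signal turns on at t = 0.5
noise = 0.2 * np.sin(2 * np.pi * 60 * x)      # 60 Hz periodic noise
y = signal + noise

# Characterize the noise on the quiet part of the record (before the real
# signal starts), then subtract the extrapolated periodic component
y_clean = remove_periodic_noise(x, y, baseline_interval=[0.0, 0.4], amp_threshold=0.1)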
Conform this object onto a new set of indexes, filling in
missing values using interpolation. If only one indexer is specified,
utils.uinterp1d is used. If more than one indexer is specified,
utils.URegularGridInterpolator is used.
copy – bool, optional
If copy=True, the returned array’s dataset contains only copied
variables. If copy=False and no reindexing is required then
original variables from this array’s dataset are returned.
method – {‘linear’}, optional
Method to use for filling index values in indexers not found on
this data array:
* linear: Linear interpolation between points
interpolate_kws – dict, optional
Key word arguments passed to either uinterp1d (if len(indexers)==1) or
URegularGridInterpolator.
**indexers – dict
Dictionary with keys given by dimension names and values given by
arrays of coordinates tick labels. Any mis-matched coordinate values
will be filled in with NaN, and any mis-matched dimension names will
simply be ignored.
Returns:
Another dataset array, with new coordinates and interpolated data.
Conform this object onto a new set of indexes, filling values along changed dimensions
using nu_conv on each in the order they are kept in the data object.
Parameters:
copy – bool, optional
If copy=True, the returned array’s dataset contains only copied
variables. If copy=False and no reindexing is required then
original variables from this array’s dataset are returned.
method – str/function, optional
Window function used in nu_conv for filling index values in indexers.
window_sizes – dict, optional
Window size used in nu_conv along each dimension specified in indexers.
Note, no convolution is performed on dimensions not explicitly given in indexers.
causal – bool, optional
Passed to nu_conv, where it forces window function f(x>0)=0.
**indexers – dict
Dictionary with keys given by dimension names and values given by
arrays of coordinate’s tick labels. Any mis-matched coordinate values
will be filled in with NaN, and any mis-matched dimension names will
simply be ignored.
interpolate – False or number
Parameter indicating whether to interpolate the data so that there are interpolate
data points within a time window
std_dev – str/int.
Accepted strings are ‘propagate’ or ‘none’. Future options will include ‘mean’, and ‘population’.
Setting to an integer will convolve the error uncertainties to the std_dev power before taking the std_dev root.
Returns:
DataArray
Another dataset array, with new coordinates and interpolated data.
One dimensional smoothing. Every projection of the DataArray values
in the specified dimension is passed to the OMFIT nu_conv smoothing function.
Parameters:
axis (int,str (if data is DataArray or Dataset)) – Axis along which 1D smoothing is applied.
Documentation for the smooth function is below.
Convolution of a non-uniformly discretized array with window function.
The output values are np.nan where no points are found in finite windows (weight is zero).
The gaussian window is infinite in extent, and thus returns values for all xo.
Supports uncertainties arrays.
If the input does not have associated uncertainties, then the output will not have associated uncertainties.
Parameters:
yi – array_like (…,N,…). Values of input array
xi – array_like (N,). Original grid points of input array (default y indices)
xo – array_like (M,). Output grid points of convolution array (default xi)
window_size – float.
Width of the window passed to the window function (default maximum xi step).
For the Gaussian, sigma=window_size/4. and the convolution is integrated across +/-4.*sigma.
window_function – str/function.
Accepted strings are ‘hanning’,’bartlett’,’blackman’,’gaussian’, or ‘boxcar’.
Function should accept x and window_size as arguments and return a corresponding weight.
axis – int. Axis of y along which convolution is performed
causal – int. Forces f(x>0) = 0.
interpolate – False or integer number > 0
Parameter indicating whether to interpolate the data so that there are `interpolate`
data points within a time window. This is useful in the presence of sparse
data, which would otherwise result in stair-case output.
The integer value sets the number of points per window size.
std_dev – str/int
Accepted strings are ‘none’, ‘propagate’, ‘population’, ‘expand’, ‘deviation’, ‘variance’.
Only ‘population’ and ‘none’ are valid if yi is not an uncertainties array (i.e. std_devs(yi) is all zeros).
Setting to an integer will convolve the error uncertainties to the std_dev power before taking the std_dev root.
std_dev = ‘propagate’ is true propagation of errors (slow if not interpolating)
std_dev = ‘population’ is the weighted “standard deviation” of the points themselves (strictly correct for the boxcar window)
std_dev = ‘expand’ is propagation of errors weighted by w~1/window_function
std_dev = ‘deviation’ is equivalent to std_dev=1
std_dev = ‘variance’ is equivalent to std_dev=2
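A short, hedged example of smoothing non-uniformly sampled data with nu_conv (the import location from omfit_classes.utils_math is an assumption; adjust to wherever nu_conv is exposed in your installation):

import numpy as np
from omfit_classes.utils_math import nu_conv  # assumed import location

# Non-uniformly sampled, noisy data
xi = np.sort(np.random.uniform(0, 10, 300))
yi = np.sin(xi) + 0.3 * np.random.randn(xi.size)

# Smooth onto a regular output grid with a Gaussian window of width 1.0
xo = np.linspace(0, 10, 200)
yo = nu_conv(yi, xi=xi, xo=xo, window_size=1.0, window_function='gaussian')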
Method for saving xarray Dataset to NetCDF, with support for boolean, uncertain, and complex data
Also, attributes support OMFITexpressions, lists, tuples, dicts
Parameters:
data – Dataset object to be saved
path – filename to save NetCDF to
complex_dim – str. Name of extra dimension (0,1) assigned to (real, imag) complex data.
*args – arguments passed to Dataset.to_netcdf function
**kw – keyword arguments passed to Dataset.to_netcdf function
Returns:
output from Dataset.to_netcdf function
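A minimal sketch of exportDataset, assuming it is importable from omfit_classes.omfit_data (the dataset contents are illustrative):

import numpy as np
import xarray as xr
from omfit_classes.omfit_data import exportDataset  # assumed import location

ds = xr.Dataset(
    {'Zeff': ('time', np.array([1.8, 2.1, 2.3])),
     'mode': ('time', np.array([1 + 2j, 0.5 - 1j, 2 + 0j]))},  # complex data
    coords={'time': [0.1, 0.2, 0.3]},
)
# Complex variables are stored along an extra (real, imag) dimension on write
exportDataset(ds, '/tmp/example.nc', complex_dim='i')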
ORIGINAL Dataset.to_netcdf DOCUMENTATION
Write dataset contents to a netCDF file.
path – str, path-like or file-like, optional
Path to which to save this dataset. File-like objects are only
supported by the scipy engine. If no path is provided, this
function returns the resulting netCDF file as bytes; in this case,
we need to use scipy, which does not support netCDF version 4 (the
default format becomes NETCDF3_64BIT).
mode – {“w”, “a”}, default: “w”
Write (‘w’) or append (‘a’) mode. If mode=’w’, any existing file at
this location will be overwritten. If mode=’a’, existing variables
will be overwritten.
NETCDF4: Data is stored in an HDF5 file, using netCDF4 API
features.
NETCDF4_CLASSIC: Data is stored in an HDF5 file, using only
netCDF 3 compatible API features.
NETCDF3_64BIT: 64-bit offset version of the netCDF 3 file format,
which fully supports 2+ GB files, but is only compatible with
clients linked against netCDF version 3.6.0 or later.
NETCDF3_CLASSIC: The classic netCDF 3 file format. It does not
handle 2+ GB files very well.
All formats are supported by the netCDF4-python library.
scipy.io.netcdf only supports the last two formats.
The default format is NETCDF4 if you are saving a file to disk and
have the netCDF4-python library available. Otherwise, xarray falls
back to using scipy to write netCDF files and defaults to the
NETCDF3_64BIT format (scipy does not support netCDF4).
group – str, optional
Path to the netCDF4 group in the given file to open (only works for
format=’NETCDF4’). The group(s) will be created if necessary.
engine – {“netcdf4”, “scipy”, “h5netcdf”}, optional
Engine to use when writing netCDF files. If not provided, the
default engine is chosen based on available dependencies, with a
preference for ‘netcdf4’ if writing to a file on disk.
encoding – dict, optional
Nested dictionary with variable names as keys and dictionaries of
variable specific encodings as values, e.g.,
{"my_variable":{"dtype":"int16","scale_factor":0.1,"zlib":True},...}
The h5netcdf engine supports both the NetCDF4-style compression
encoding parameters {"zlib":True,"complevel":9} and the h5py
ones {"compression":"gzip","compression_opts":9}.
This allows using any compression plugin installed in the HDF5
library, e.g. LZF.
unlimited_dims – iterable of hashable, optional
Dimension(s) that should be serialized as unlimited dimensions.
By default, no dimensions are treated as unlimited dimensions.
Note that unlimited_dims may also be set via
dataset.encoding["unlimited_dims"].
compute: bool, default: True
If true compute immediately, otherwise return a
dask.delayed.Delayed object that can be computed later.
invalid_netcdf: bool, default: False
Only valid along with engine="h5netcdf". If True, allow writing
hdf5 files which are invalid netcdf as described in
https://github.com/shoyer/h5netcdf.
Method for loading from xarray Dataset saved as netcdf file, with support for boolean, uncertain, and complex data.
Also, attributes support OMFITexpressions, lists, tuples, dicts
Parameters:
filename_or_obj – str, file or xarray.backends.*DataStore
Strings are interpreted as a path to a netCDF file or an OpenDAP URL
and opened with python-netCDF4, unless the filename ends with .gz, in
which case the file is gunzipped and opened with scipy.io.netcdf (only
netCDF3 supported). File-like objects are opened with scipy.io.netcdf
(only netCDF3 supported).
complex_dim – str, name of length-2 dimension (0,1) containing (real, imag) complex data.
*args – arguments passed to xarray.open_dataset function
**kw – keywords arguments passed to xarray.open_dataset function
Returns:
xarray Dataset object containing the loaded data
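The corresponding load-side sketch, again assuming the omfit_classes.omfit_data import location and the file written in the exportDataset example above:

from omfit_classes.omfit_data import importDataset  # assumed import location

ds = importDataset('/tmp/example.nc', complex_dim='i')
print(ds['mode'].values)  # complex values reassembled from the (real, imag) pairs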
ORIGINAL xarray.open_dataset DOCUMENTATION
Open and decode a dataset from a file or file-like object.
Strings and Path objects are interpreted as a path to a netCDF file
or an OpenDAP URL and opened with python-netCDF4, unless the filename
ends with .gz, in which case the file is gunzipped and opened with
scipy.io.netcdf (only netCDF3 supported). Byte-strings or file-like
objects are opened by scipy.io.netcdf (netCDF3) or h5py (netCDF4/HDF).
engine – {“netcdf4”, “scipy”, “pydap”, “h5netcdf”, “pynio”, “cfgrib”, “pseudonetcdf”, “zarr”} or subclass of xarray.backends.BackendEntrypoint, optional
Engine to use when reading files. If not provided, the default engine
is chosen based on available dependencies, with a preference for
“netcdf4”. A custom backend class (a subclass of BackendEntrypoint)
can also be used.
chunks – int or dict, optional
If chunks is provided, it is used to load the new dataset into dask
arrays. chunks=-1 loads the dataset with dask using a single
chunk for all arrays. chunks={} loads the dataset with dask using
engine preferred chunks if exposed by the backend, otherwise with
a single chunk for all arrays.
chunks='auto' will use dask auto chunking taking into account the
engine preferred chunks. See dask chunking for more details.
cache – bool, optional
If True, cache data loaded from the underlying datastore in memory as
NumPy arrays when accessed to avoid reading from the underlying data-
store multiple times. Defaults to True unless you specify the chunks
argument to use dask, in which case it defaults to False. Does not
change the behavior of coordinates corresponding to dimensions, which
always load their data from disk into a pandas.Index.
decode_cf – bool, optional
Whether to decode these variables, assuming they were saved according
to CF conventions.
mask_and_scale – bool, optional
If True, replace array values equal to _FillValue with NA and scale
values according to the formula original_values * scale_factor +
add_offset, where _FillValue, scale_factor and add_offset are
taken from variable attributes (if they exist). If the _FillValue or
missing_value attribute contains multiple values a warning will be
issued and all array values matching one of the multiple values will
be replaced by NA. mask_and_scale defaults to True except for the
pseudonetcdf backend. This keyword may not be supported by all the backends.
decode_times – bool, optional
If True, decode times encoded in the standard NetCDF datetime format
into datetime objects. Otherwise, leave them encoded as numbers.
This keyword may not be supported by all the backends.
decode_timedelta – bool, optional
If True, decode variables and coordinates with time units in
{“days”, “hours”, “minutes”, “seconds”, “milliseconds”, “microseconds”}
into timedelta objects. If False, leave them encoded as numbers.
If None (default), assume the same value of decode_time.
This keyword may not be supported by all the backends.
use_cftime: bool, optional
Only relevant if encoded dates come from a standard calendar
(e.g. “gregorian”, “proleptic_gregorian”, “standard”, or not
specified). If None (default), attempt to decode times to
np.datetime64[ns] objects; if this is not possible, decode times to
cftime.datetime objects. If True, always decode times to
cftime.datetime objects, regardless of whether or not they can be
represented using np.datetime64[ns] objects. If False, always
decode times to np.datetime64[ns] objects; if this is not possible
raise an error. This keyword may not be supported by all the backends.
concat_characters – bool, optional
If True, concatenate along the last dimension of character arrays to
form string arrays. Dimensions will only be concatenated over (and
removed) if they have no corresponding variable and if they are only
used as the last dimension of character arrays.
This keyword may not be supported by all the backends.
decode_coords – bool or {“coordinates”, “all”}, optional
Controls which variables are set as coordinate variables:
“coordinates” or True: Set variables referred to in the
'coordinates' attribute of the datasets or individual variables
as coordinate variables.
“all”: Set variables referred to in 'grid_mapping', 'bounds' and
other attributes as coordinate variables.
drop_variables: str or iterable, optional
A variable or list of variables to exclude from being parsed from the
dataset. This may be useful to drop variables with problems or
inconsistent values.
backend_kwargs: dict
Additional keyword arguments passed on to the engine open function,
equivalent to **kwargs.
Additional keyword arguments passed on to the engine open function.
For example:
‘group’: path to the netCDF4 group in the given file to open, given as
a str; supported by “netcdf4”, “h5netcdf”, “zarr”.
‘lock’: resource lock to use when reading data from disk. Only
relevant when using dask or another form of parallelism. By default,
appropriate locks are chosen to safely read and write files with the
currently active dask scheduler. Supported by “netcdf4”, “h5netcdf”,
“scipy”, “pynio”, “pseudonetcdf”, “cfgrib”.
See engine open function for kwargs accepted by each specific engine.
open_dataset opens the file with read-only access. When you modify
values of a Dataset, even one linked to files on disk, only the in-memory
copy you are manipulating in xarray is modified: the original file on disk
is never touched.
Saves file using system move and copy commands if data in memory is unchanged,
and the exportDataset function if it has changed.
Saving NetCDF files takes much longer than loading them. Since most of the time NetCDF files are only read
and not edited, it makes sense to check whether any changes were made before re-saving the same file from scratch.
If the file has not changed, one can simply copy the “old” file with a system copy.
Parameters:
force_write – bool. Forces the (re)writing of the file, even if the data is unchanged.
**kw – keyword arguments passed to Dataset.to_netcdf function
Saves file using system move and copy commands if data in memory is unchanged,
and the exportDataset function if it has changed.
Saving NetCDF files takes much longer than loading them. Since most of the time NetCDF files are only read
and not edited, it makes sense to check whether any changes were made before re-saving the same file from scratch.
If the file has not changed, one can simply copy the “old” file with a system copy.
Parameters:
force_write – bool. Forces the (re)writing of the file, even if the data is unchanged.
**kw – keyword arguments passed to Dataset.to_netcdf function
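Conceptually, the fast path described above amounts to something like the sketch below; it is not the actual implementation, and the change-tracking flag and helper names are made up:

import shutil
from omfit_classes.omfit_data import exportDataset  # assumed import location

def save_nc(dataset_obj, filename, previous_filename, changed, force_write=False):
    if changed or force_write:
        exportDataset(dataset_obj, filename)        # full rewrite (slow)
    else:
        shutil.copy2(previous_filename, filename)   # unchanged data: plain file copy (fast)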
path_or_buf – a valid JSON str, path object or file-like object
Any valid string path is acceptable. The string could be a URL. Valid
URL schemes include http, ftp, s3, and file. For file URLs, a host is
expected. A local file could be:
file://localhost/path/to/table.json.
If you want to pass in a path object, pandas accepts any
os.PathLike.
By file-like object, we refer to objects with a read() method,
such as a file handle (e.g. via builtin open function)
or StringIO.
orient – str
Indication of expected JSON string format.
Compatible JSON strings can be produced by to_json() with a
corresponding orient value.
The set of possible orients is:
'split' : dict like
{index->[index],columns->[columns],data->[values]}
'records' : list like
[{column->value},...,{column->value}]
'index' : dict like {index->{column->value}}
'columns' : dict like {column->{index->value}}
'values' : just the values array
The allowed and default values depend on the value
of the typ parameter.
when typ=='series',
allowed orients are {'split','records','index'}
default is 'index'
The Series index must be unique for orient 'index'.
when typ=='frame',
allowed orients are {'split','records','index','columns','values','table'}
default is 'columns'
The DataFrame index must be unique for orients 'index' and
'columns'.
The DataFrame columns must be unique for orients 'index',
'columns', and 'records'.
typ – {‘frame’, ‘series’}, default ‘frame’
The type of object to recover.
dtype – bool or dict, default None
If True, infer dtypes; if a dict of column to dtype, then use those;
if False, then don’t infer dtypes at all, applies only to the data.
For all orient values except 'table', default is True.
Changed in version 0.25.0: Not applicable for orient='table'.
convert_axes – bool, default None
Try to convert the axes to the proper dtypes.
For all orient values except 'table', default is True.
Changed in version 0.25.0: Not applicable for orient='table'.
convert_dates – bool or list of str, default True
If True then default datelike columns may be converted (depending on
keep_default_dates).
If False, no dates will be converted.
If a list of column names, then those columns will be converted and
default datelike columns may also be converted (depending on
keep_default_dates).
keep_default_dates – bool, default True
If parsing dates (convert_dates is not False), then try to parse the
default datelike columns.
A column label is datelike if
it ends with '_at',
it ends with '_time',
it begins with 'timestamp',
it is 'modified', or
it is 'date'.
numpy – bool, default False
Direct decoding to numpy arrays. Supports numeric data only, but
non-numeric column and index labels are supported. Note also that the
JSON ordering MUST be the same for each term if numpy=True.
Deprecated since version 1.0.0.
precise_float – bool, default False
Set to enable usage of higher precision (strtod) function when
decoding string to double values. Default (False) is to use fast but
less precise builtin functionality.
date_unit – str, default None
The timestamp unit to detect if converting dates. The default behaviour
is to try and detect the correct precision, but if this is not desired
then pass one of ‘s’, ‘ms’, ‘us’ or ‘ns’ to force parsing only seconds,
milliseconds, microseconds or nanoseconds respectively.
Return JsonReader object for iteration.
See the line-delimited json docs
for more information on chunksize.
This can only be passed if lines=True.
If this is None, the file will be read into memory all at once.
Changed in version 1.2: JsonReader is a context manager.
For on-the-fly decompression of on-disk data. If ‘infer’, then use
gzip, bz2, zip or xz if path_or_buf is a string ending in
‘.gz’, ‘.bz2’, ‘.zip’, or ‘xz’, respectively, and no decompression
otherwise. If using ‘zip’, the ZIP file must contain only one data
file to be read in. Set to None for no decompression.
nrows – int, optional
The number of lines from the line-delimited jsonfile that has to be read.
This can only be passed if lines=True.
If this is None, all the rows will be returned.
New in version 1.1.
storage_options – dict, optional
Extra options that make sense for a particular storage connection, e.g.
host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
are forwarded to urllib as header options. For other URLs (e.g.
starting with “s3://”, and “gcs://”) the key-value pairs are forwarded to
fsspec. Please see fsspec and urllib for more details.
Specific to orient='table', if a DataFrame with a literal
Index name of index gets written with to_json(), the
subsequent read operation will incorrectly set the Index name to
None. This is because index is also used by DataFrame.to_json()
to denote a missing Index name, and the subsequent
read_json() operation cannot distinguish between the two. The same
limitation is encountered with a MultiIndex and any names
beginning with 'level_'.
Encoding/decoding a Dataframe using 'split' formatted JSON:
>>> df.to_json(orient='split')
'{"columns":["col 1","col 2"],"index":["row 1","row 2"],"data":[["a","b"],["c","d"]]}'
>>> pd.read_json(_, orient='split')
      col 1 col 2
row 1     a     b
row 2     c     d
Encoding/decoding a Dataframe using 'index' formatted JSON:
Data structure also contains labeled axes (rows and columns).
Arithmetic operations align on both row and column labels. Can be
thought of as a dict-like container for Series objects. The primary
pandas data structure.
data – ndarray (structured or homogeneous), Iterable, dict, or DataFrame
Dict can contain Series, arrays, constants, dataclass or list-like objects. If
data is a dict, column order follows insertion-order.
Changed in version 0.25.0: If data is a list of dicts, column order follows insertion-order.
index – Index or array-like
Index to use for resulting frame. Will default to RangeIndex if
no indexing information part of input data and no index provided.
columns – Index or array-like
Column labels to use for resulting frame when data does not have them,
defaulting to RangeIndex(0, 1, 2, …, n). If data contains column labels,
will perform column selection instead.
dtype – dtype, default None
Data type to force. Only a single dtype is allowed. If None, infer.
copy – bool or None, default None
Copy data from inputs.
For dict data, the default of None behaves like copy=True. For DataFrame
or 2d ndarray input, the default of None behaves like copy=False.
DataFrame.from_records : Constructor from tuples, also record arrays.
DataFrame.from_dict : From dicts of Series, arrays, or dicts.
read_csv : Read a comma-separated values (csv) file into DataFrame.
read_table : Read general delimited file into DataFrame.
read_clipboard : Read text from clipboard into DataFrame.
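For example, building a small frame from a dict of columns (standard pandas usage):

>>> import pandas as pd
>>> df = pd.DataFrame({'col 1': ['a', 'c'], 'col 2': ['b', 'd']}, index=['row 1', 'row 2'])
>>> df
      col 1 col 2
row 1     a     b
row 2     c     d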
bufstr, Path or StringIO-like, optional, default None
Buffer to write to. If None, the output is returned as a string.
columnssequence, optional, default None
The subset of columns to write. Writes all columns by default.
col_spaceint, list or dict of int, optional
The minimum width of each column.
headerbool or sequence, optional
Write out the column names. If a list of strings is given, it is assumed to be aliases for the column names.
indexbool, optional, default True
Whether to print index (row) labels.
na_repstr, optional, default ‘NaN’
String representation of NaN to use.
formatterslist, tuple or dict of one-param. functions, optional
Formatter functions to apply to columns’ elements by position or
name.
The result of each function must be a unicode string.
List/tuple must be of length equal to the number of columns.
Formatter function to apply to columns’ elements if they are
floats. This function must return a unicode string and will be
applied only to the non-NaN elements, with NaN being
handled by na_rep.
Changed in version 1.2.0.
sparsifybool, optional, default True
Set to False for a DataFrame with a hierarchical index to print
every multiindex key at each row.
index_namesbool, optional, default True
Prints the names of the indexes.
justifystr, default None
How to justify the column labels. If None uses the option from
the print configuration (controlled by set_option), ‘right’ out
of the box. Valid values are
left
right
center
justify
justify-all
start
end
inherit
match-parent
initial
unset.
max_rowsint, optional
Maximum number of rows to display in the console.
min_rowsint, optional
The number of rows to display in the console in a truncated repr
(when number of rows is above max_rows).
max_colsint, optional
Maximum number of columns to display in the console.
show_dimensionsbool, default False
Display DataFrame dimensions (number of rows by number of columns).
decimalstr, default ‘.’
Character recognized as decimal separator, e.g. ‘,’ in Europe.
line_widthint, optional
Width to wrap a line in characters.
max_colwidthint, optional
Max width to truncate each column in characters. By default, no limit.
Because iterrows returns a Series for each row,
it does not preserve dtypes across the rows (dtypes are
preserved across columns for DataFrames).
To preserve dtypes while iterating over the rows, it is better
to use itertuples() which returns namedtuples of the values
and which is generally faster than iterrows.
You should never modify something you are iterating over.
This is not guaranteed to work in all cases. Depending on the
data types, the iterator returns a copy and not a view, and writing
to it will have no effect.
An object to iterate over namedtuples for each row in the
DataFrame with the first field possibly being the index and
following fields being the column values.
The column names will be renamed to positional names if they are
invalid Python identifiers, repeated, or start with an underscore.
On python versions < 3.7 regular tuples are returned for DataFrames
with a large number of columns (>254).
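For example, iterating with itertuples (standard pandas usage):

>>> import pandas as pd
>>> df = pd.DataFrame({'a': [1, 2], 'b': [0.5, 1.5]})
>>> for row in df.itertuples():
...     print(row.Index, row.a, row.b)
0 1 0.5
1 2 1.5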
If other is a Series, return the matrix product between self and
other as a Series. If other is a DataFrame or a numpy.array, return
the matrix product of self and other in a DataFrame or an np.array.
The dimensions of DataFrame and other must be compatible in order to
compute the matrix multiplication. In addition, the column names of
DataFrame and the index of other must contain the same values, as they
will be aligned prior to the multiplication.
The dot method for Series computes the inner product, instead of the
matrix product here.
Of the form {field : array-like} or {field : dict}.
orient – {‘columns’, ‘index’}, default ‘columns’
The “orientation” of the data. If the keys of the passed dict
should be the columns of the resulting DataFrame, pass ‘columns’
(default). Otherwise if the keys should be rows, pass ‘index’.
dtype – dtype, default None
Data type to force, otherwise infer.
columns – list, default None
Column labels to use when orient='index'. Raises a ValueError
if used with orient='columns'.
By default, the dtype of the returned array will be the common NumPy
dtype of all types in the DataFrame. For example, if the dtypes are
float16 and float32, the results dtype will be float32.
This may require copying data and coercing values, which may be
expensive.
Whether to ensure that the returned value is not a view on
another array. Note that copy=False does not ensure that
to_numpy() is no-copy. Rather, copy=True ensures that
a copy is made, even if not strictly necessary.
na_value – Any, optional
The value to use for missing values. The default value depends
on dtype and the dtypes of the DataFrame columns.
‘records’ : list like
[{column -> value}, … , {column -> value}]
‘index’ : dict like {index -> {column -> value}}
Abbreviations are allowed. s indicates series and sp
indicates split.
into – class, default dict
The collections.abc.Mapping subclass used for all Mappings
in the return value. Can be the actual class or an empty
instance of the mapping type you want. If you want a
collections.defaultdict, you must pass it initialized.
List of BigQuery table fields to which the corresponding DataFrame
columns conform, e.g. [{'name':'col1','type':'STRING'},...]. If schema is not provided, it will be
generated according to dtypes of DataFrame columns. See
BigQuery API documentation on available names of a field.
New in version 0.3.1 of pandas-gbq.
location – str, optional
Location where the load job should run. See the BigQuery locations
documentation for a
list of available locations. The location must match that of the
target dataset.
New in version 0.5.0 of pandas-gbq.
progress_bar – bool, default True
Use the library tqdm to show the progress bar for the upload,
chunk by chunk.
Credentials for accessing Google APIs. Use this parameter to
override default credentials, such as to use Compute Engine
google.auth.compute_engine.Credentials or Service
Account google.oauth2.service_account.Credentials
directly.
data – structured ndarray, sequence of tuples or dicts, or DataFrame
Structured input data.
index – str, list of fields, array-like
Field of array to use as the index, alternately a specific set of
input labels to use.
exclude – sequence, default None
Columns or fields to exclude.
columns – sequence, default None
Column names to use. If the passed data do not have names
associated with them, this argument provides names for the
columns. Otherwise this argument indicates the order of the columns
in the result (any names not found in the data will become all-NA
columns).
coerce_float – bool, default False
Attempt to convert values of non-string, non-numeric objects (like
decimal.Decimal) to floating point, useful for SQL result sets.
Include index in resulting record array, stored in ‘index’
field or using the index label, if set.
column_dtypes – str, type, dict, default None
If a string or type, the data type to store all columns. If
a dictionary, a mapping of column names and indices (zero-indexed)
to specific data types.
index_dtypes – str, type, dict, default None
If a string or type, the data type to store all index levels. If
a dictionary, a mapping of index level names and indices
(zero-indexed) to specific data types.
String, path object (pathlib.Path or py._path.local.LocalPath) or
object implementing a binary write() function. If using a buffer
then the buffer will not be automatically closed after the file
data has been written.
Changed in version 1.0.0.
Previously this was “fname”
convert_datesdict
Dictionary mapping columns containing datetime types to stata
internal format to use when writing the dates. Options are ‘tc’,
‘td’, ‘tm’, ‘tw’, ‘th’, ‘tq’, ‘ty’. Column can be either an integer
or a name. Datetime columns that do not have a conversion type
specified will be converted to ‘tc’. Raises NotImplementedError if
a datetime column has timezone information.
write_indexbool
Write the index to Stata dataset.
byteorderstr
Can be “>”, “<”, “little”, or “big”. default is sys.byteorder.
time_stampdatetime
A datetime to use as file creation date. Default is the current
time.
data_labelstr, optional
A label for the data set. Must be 80 characters or smaller.
variable_labelsdict
Dictionary containing columns as keys and variable labels as
values. Each label must be 80 characters or smaller.
version{114, 117, 118, 119, None}, default 114
Version to use in the output dta file. Set to None to let pandas
decide between 118 or 119 formats depending on the number of
columns in the frame. Version 114 can be read by Stata 10 and
later. Version 117 can be read by Stata 13 or later. Version 118
is supported in Stata 14 and later. Version 119 is supported in
Stata 15 and later. Version 114 limits string variables to 244
characters or fewer while versions 117 and later allow strings
with lengths up to 2,000,000 characters. Versions 118 and 119
support Unicode characters, and version 119 supports more than
32,767 variables.
Version 119 should usually only be used when the number of
variables exceeds the capacity of dta format 118. Exporting
smaller datasets in format 119 may have unintended consequences,
and, as of November 2020, Stata SE cannot read version 119 files.
Changed in version 1.0.0: Added support for formats 118 and 119.
convert_strllist, optional
List of column names to convert to string columns to Stata StrL
format. Only available if version is 117. Storing strings in the
StrL format can produce smaller dta files if strings have more than
8 characters and values are repeated.
compressionstr or dict, default ‘infer’
For on-the-fly compression of the output dta. If string, specifies
compression mode. If dict, value at key ‘method’ specifies
compression mode. Compression mode must be one of {‘infer’, ‘gzip’,
‘bz2’, ‘zip’, ‘xz’, None}. If compression mode is ‘infer’ and
fname is path-like, then detect compression from the following
extensions: ‘.gz’, ‘.bz2’, ‘.zip’, or ‘.xz’ (otherwise no
compression). If dict and compression mode is one of {‘zip’,
‘gzip’, ‘bz2’}, or inferred as one of the above, other entries
passed as additional compression options.
New in version 1.1.0.
storage_optionsdict, optional
Extra options that make sense for a particular storage connection, e.g.
host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
are forwarded to urllib as header options. For other URLs (e.g.
starting with “s3://”, and “gcs://”) the key-value pairs are forwarded to
fsspec. Please see fsspec and urllib for more details.
read_stata : Import Stata data files.
io.stata.StataWriter : Low-level writer for Stata data files.
io.stata.StataWriter117 : Low-level writer for version 117 files.
Additional keywords passed to pyarrow.feather.write_feather().
Starting with pyarrow 0.17, this includes the compression,
compression_level, chunksize and version keywords.
bufstr, Path or StringIO-like, optional, default None
Buffer to write to. If None, the output is returned as a string.
modestr, optional
Mode in which file is opened, “wt” by default.
indexbool, optional, default True
Add index (row) labels.
New in version 1.1.0.
storage_optionsdict, optional
Extra options that make sense for a particular storage connection, e.g.
host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
are forwarded to urllib as header options. For other URLs (e.g.
starting with “s3://”, and “gcs://”) the key-value pairs are forwarded to
fsspec. Please see fsspec and urllib for more details.
This function writes the dataframe as a parquet file. You can choose different parquet
backends, and have the option of compression. See
the user guide for more details.
If a string, it will be used as Root Directory path
when writing a partitioned dataset. By file-like object,
we refer to objects with a write() method, such as a file handle
(e.g. via builtin open function) or io.BytesIO. The engine
fastparquet does not accept file-like objects. If path is None,
a bytes object is returned.
Parquet library to use. If ‘auto’, then the option
io.parquet.engine is used. The default io.parquet.engine
behavior is to try ‘pyarrow’, falling back to ‘fastparquet’ if
‘pyarrow’ is unavailable.
Name of the compression to use. Use None for no compression.
indexbool, default None
If True, include the dataframe’s index(es) in the file output.
If False, they will not be written to the file.
If None, similar to True the dataframe’s index(es)
will be saved. However, instead of being saved as values,
the RangeIndex will be stored as a range in the metadata so it
doesn’t require much space and is faster. Other indexes will
be included as columns in the file output.
partition_colslist, optional, default None
Column names by which to partition the dataset.
Columns are partitioned in the order they are given.
Must be None if path is not a string.
storage_optionsdict, optional
Extra options that make sense for a particular storage connection, e.g.
host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
are forwarded to urllib as header options. For other URLs (e.g.
starting with “s3://”, and “gcs://”) the key-value pairs are forwarded to
fsspec. Please see fsspec and urllib for more details.
If you want to get a buffer to the parquet content you can use a io.BytesIO
object, as long as you don’t use partition_cols, which creates multiple files.
bufstr, Path or StringIO-like, optional, default None
Buffer to write to. If None, the output is returned as a string.
columnssequence, optional, default None
The subset of columns to write. Writes all columns by default.
col_spacestr or int, list or dict of int or str, optional
The minimum width of each column in CSS length units. An int is assumed to be px units.
New in version 0.25.0: Ability to use str.
headerbool, optional
Whether to print column labels, default True.
indexbool, optional, default True
Whether to print index (row) labels.
na_repstr, optional, default ‘NaN’
String representation of NaN to use.
formatterslist, tuple or dict of one-param. functions, optional
Formatter functions to apply to columns’ elements by position or
name.
The result of each function must be a unicode string.
List/tuple must be of length equal to the number of columns.
Formatter function to apply to columns’ elements if they are
floats. This function must return a unicode string and will be
applied only to the non-NaN elements, with NaN being
handled by na_rep.
Changed in version 1.2.0.
sparsifybool, optional, default True
Set to False for a DataFrame with a hierarchical index to print
every multiindex key at each row.
index_namesbool, optional, default True
Prints the names of the indexes.
justifystr, default None
How to justify the column labels. If None uses the option from
the print configuration (controlled by set_option), ‘right’ out
of the box. Valid values are
left
right
center
justify
justify-all
start
end
inherit
match-parent
initial
unset.
max_rowsint, optional
Maximum number of rows to display in the console.
min_rowsint, optional
The number of rows to display in the console in a truncated repr
(when number of rows is above max_rows).
max_colsint, optional
Maximum number of columns to display in the console.
show_dimensionsbool, default False
Display DataFrame dimensions (number of rows by number of columns).
decimalstr, default ‘.’
Character recognized as decimal separator, e.g. ‘,’ in Europe.
bold_rowsbool, default True
Make the row labels bold in the output.
classesstr or list or tuple, default None
CSS class(es) to apply to the resulting html table.
escapebool, default True
Convert the characters <, >, and & to HTML-safe sequences.
notebook{True, False}, default False
Whether the generated HTML is for IPython Notebook.
borderint
A border=border attribute is included in the opening
<table> tag. Default pd.options.display.html.border.
encodingstr, default “utf-8”
Set character encoding.
New in version 1.0.
table_idstr, optional
A css id is included in the opening <table> tag if specified.
path_or_bufferstr, path object or file-like object, optional
File to write output to. If None, the output is returned as a
string.
indexbool, default True
Whether to include index in XML document.
root_namestr, default ‘data’
The name of root element in XML document.
row_namestr, default ‘row’
The name of row element in XML document.
na_repstr, optional
Missing data representation.
attr_colslist-like, optional
List of columns to write as attributes in row element.
Hierarchical columns will be flattened with underscore
delimiting the different levels.
elem_colslist-like, optional
List of columns to write as children in row element. By default,
all columns output as children of row element. Hierarchical
columns will be flattened with underscore delimiting the
different levels.
namespacesdict, optional
All namespaces to be defined in root element. Keys of dict
should be prefix names and values of dict corresponding URIs.
Default namespaces should be given empty string key. For
example,
namespaces={"":"https://example.com"}
prefixstr, optional
Namespace prefix to be used for every element and/or attribute
in document. This should be one of the keys in namespaces
dict.
encodingstr, default ‘utf-8’
Encoding of the resulting document.
xml_declarationbool, default True
Whether to include the XML declaration at start of document.
pretty_printbool, default True
Whether output should be pretty printed with indentation and
line breaks.
parser{‘lxml’,’etree’}, default ‘lxml’
Parser module to use for building of tree. Only ‘lxml’ and
‘etree’ are supported. With ‘lxml’, the ability to use XSLT
stylesheet is supported.
stylesheetstr, path object or file-like object, optional
A URL, file-like object, or a raw string containing an XSLT
script used to transform the raw XML output. Script should use
layout of elements and attributes from original output. This
argument requires lxml to be installed. Only XSLT 1.0
scripts are currently supported; later versions are not.
For on-the-fly decompression of on-disk data. If ‘infer’, then use
gzip, bz2, zip or xz if path_or_buffer is a string ending in
‘.gz’, ‘.bz2’, ‘.zip’, or ‘xz’, respectively, and no decompression
otherwise. If using ‘zip’, the ZIP file must contain only one data
file to be read in. Set to None for no decompression.
storage_optionsdict, optional
Extra options that make sense for a particular storage connection, e.g.
host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
are forwarded to urllib as header options. For other URLs (e.g.
starting with “s3://”, and “gcs://”) the key-value pairs are forwarded to
fsspec. Please see fsspec and urllib for more details.
Whether to print the full summary. By default, the setting in
pandas.options.display.max_info_columns is followed.
buf – writable buffer, defaults to sys.stdout
Where to send the output. By default, the output is printed to
sys.stdout. Pass a writable buffer if you need to further process
the output.
max_cols – int, optional
When to switch from the verbose to the truncated output. If the
DataFrame has more than max_cols columns, the truncated output
is used. By default, the setting in
pandas.options.display.max_info_columns is used.
memory_usage – bool, str, optional
Specifies whether total memory usage of the DataFrame
elements (including the index) should be displayed. By default,
this follows the pandas.options.display.memory_usage setting.
True always show memory usage. False never shows memory usage.
A value of ‘deep’ is equivalent to “True with deep introspection”.
Memory usage is shown in human-readable units (base-2
representation). Without deep introspection a memory estimation is
made based in column dtype and number of rows assuming values
consume the same memory amount for corresponding dtypes. With deep
memory introspection, a real memory usage calculation is performed
at the cost of computational resources.
show_counts – bool, optional
Whether to show the non-null counts. By default, this is shown
only if the DataFrame is smaller than
pandas.options.display.max_info_rows and
pandas.options.display.max_info_columns. A value of True always
shows the counts, and False never shows the counts.
null_counts – bool, optional
Deprecated since version 1.2.0: Use show_counts instead.
Specifies whether to include the memory usage of the DataFrame’s
index in returned Series. If index=True, the memory usage of
the index is the first item in the output.
deep – bool, default False
If True, introspect the data deeply by interrogating
object dtypes for system-level memory consumption, and include
it in the returned values.
Transposing a DataFrame with mixed dtypes will result in a homogeneous
DataFrame with the object dtype. In such a case, a copy of the data
is always made.
You can refer to variables
in the environment by prefixing them with an ‘@’ character like
@a+b.
You can refer to column names that are not valid Python variable names
by surrounding them in backticks. Thus, column names containing spaces
or punctuations (besides underscores) or starting with digits must be
surrounded by backticks. (For example, a column named “Area (cm^2)” would
be referenced as `Area (cm^2)`). Column names which are Python keywords
(like “list”, “for”, “import”, etc) cannot be used.
For example, if one of your columns is called aa and you want
to sum it with b, your query should be `aa`+b.
New in version 0.25.0: Backtick quoting introduced.
New in version 1.0.0: Expanding functionality of backtick quoting for more than only spaces.
inplace – bool
Whether the query should modify the data in place or return
a modified copy.
The result of the evaluation of this expression is first passed to
DataFrame.loc and if that fails because of a
multidimensional key (e.g., a DataFrame) then the result will be passed
to DataFrame.__getitem__().
This method uses the top-level eval() function to
evaluate the passed query.
The query() method uses a slightly
modified Python syntax by default. For example, the & and |
(bitwise) operators have the precedence of their boolean cousins,
and and or. This is syntactically valid Python,
however the semantics are different.
You can change the semantics of the expression by passing the keyword
argument parser='python'. This enforces the same semantics as
evaluation in Python space. Likewise, you can pass engine='python'
to evaluate an expression using Python itself as a backend. This is not
recommended as it is inefficient compared to using numexpr as the
engine.
The DataFrame.index and
DataFrame.columns attributes of the
DataFrame instance are placed in the query namespace
by default, which allows you to treat both the index and columns of the
frame as a column in the frame.
The identifier index is used for the frame index; you can also
use the name of the index to identify it in a query. Please note that
Python keywords may not be used as identifiers.
For further details and examples see the query documentation in
indexing.
Backtick quoted variables
Backtick quoted variables are parsed as literal Python code and
are converted internally to a Python valid identifier.
This can lead to the following problems.
During parsing a number of disallowed characters inside the backtick
quoted string are replaced by strings that are allowed as a Python identifier.
These characters include all operators in Python, the space character, the
question mark, the exclamation mark, the dollar sign, and the euro sign.
For other characters that fall outside the ASCII range (U+0001..U+007F)
and those that are not further specified in PEP 3131,
the query parser will raise an error.
Whitespace other than the space character is excluded,
as are the hashtag (as it is used for comments) and the backtick
itself (the backtick cannot be escaped).
In a special case, quotes that make a pair around a backtick can
confuse the parser.
For example, `it's`>`that's` will raise an error,
as it forms a quoted string ('s>`that') with a backtick inside.
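For example (standard pandas usage), combining backtick quoting with an environment variable reference; the exact console alignment may differ slightly:

>>> import pandas as pd
>>> df = pd.DataFrame({'Area (cm^2)': [1.0, 4.0, 9.0], 'b': [1, 2, 3]})
>>> threshold = 2.0
>>> df.query('`Area (cm^2)` > @threshold')
   Area (cm^2)  b
1          4.0  2
2          9.0  3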
Evaluate a string describing operations on DataFrame columns.
Operates on columns only, not specific rows or elements. This allows
eval to run arbitrary code, which can make you vulnerable to code
injection if you pass user input to this function.
If the expression contains an assignment, whether to perform the
operation inplace and mutate the existing DataFrame. Otherwise,
a new DataFrame is returned.
The column names are keywords. If the values are
callable, they are computed on the DataFrame and
assigned to the new columns. The callable must not
change input DataFrame (though pandas doesn’t check it).
If the values are not callable, (e.g. a Series, scalar, or array),
they are simply assigned.
Assigning multiple columns within the same assign is possible.
Later items in ‘**kwargs’ may refer to newly created or modified
columns in ‘df’; items are computed and assigned into ‘df’ in order.
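For example (standard pandas usage), a later keyword can reference a column created earlier in the same assign call:

>>> import pandas as pd
>>> df = pd.DataFrame({'temp_c': [10.0, 25.0]})
>>> df.assign(temp_f=lambda x: x['temp_c'] * 9 / 5 + 32,
...           temp_k=lambda x: (x['temp_f'] - 32) * 5 / 9 + 273.15)
   temp_c  temp_f  temp_k
0    10.0    50.0  283.15
1    25.0    77.0  298.15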
Label-based “fancy indexing” function for DataFrame.
Given equal-length arrays of row and column labels, return an
array of the values corresponding to each (row, col) pair.
Deprecated since version 1.2.0: DataFrame.lookup is deprecated,
use DataFrame.melt and DataFrame.loc instead.
For further details see
Looking up values by index/column labels.
Method to use for filling holes in reindexed Series:
pad / ffill: propagate last valid observation forward to next valid.
backfill / bfill: use NEXT valid observation to fill gap.
limit – int, default None
If method is specified, this is the maximum number of consecutive
NaN values to forward/backward fill. In other words, if there is
a gap with more than this number of consecutive NaNs, it will only
be partially filled. If method is not specified, this is the
maximum number of entries along the entire axis where NaNs will be
filled. Must be greater than 0 if not None.
fill_axis – {0 or ‘index’, 1 or ‘columns’}, default 0
Filling axis, method and limit.
broadcast_axis – {0 or ‘index’, 1 or ‘columns’}, default None
Broadcast values along this axis, if aligning two objects of
different dimensions.
Conform Series/DataFrame to new index with optional filling logic.
Places NA/NaN in locations having no value in the previous index. A new object
is produced unless the new index is equivalent to the current one and
copy=False.
Method to use for filling holes in reindexed DataFrame.
Please note: this is only applicable to DataFrames/Series with a
monotonically increasing/decreasing index.
None (default): don’t fill gaps
pad / ffill: Propagate last valid observation forward to next
valid.
backfill / bfill: Use next valid observation to fill gap.
nearest: Use nearest valid observations to fill gap.
copy – bool, default True
Return a new object, even if the passed indexes are the same.
level – int or name
Broadcast across a level, matching Index values on the
passed MultiIndex level.
fill_value – scalar, default np.NaN
Value to use for missing values. Defaults to NaN, but can be any
“compatible” value.
limit – int, default None
Maximum number of consecutive elements to forward or backward fill.
tolerance – optional
Maximum distance between original and new labels for inexact
matches. The values of the index at the matching locations must
satisfy the equation abs(index[indexer]-target)<=tolerance.
Tolerance may be a scalar value, which applies the same tolerance
to all values, or list-like, which applies variable tolerance per
element. List-like includes list, tuple, array, Series, and must be
the same size as the index and its dtype must exactly match the
index’s type.
DataFrame.set_index : Set row labels.
DataFrame.reset_index : Remove row labels or move them to new columns.
DataFrame.reindex_like : Change to same indices as other DataFrame.
Create a new index and reindex the dataframe. By default
values in the new index that do not have corresponding
records in the dataframe are assigned NaN.
>>> new_index = ['Safari', 'Iceweasel', 'Comodo Dragon', 'IE10', 'Chrome']
>>> df.reindex(new_index)
               http_status  response_time
Safari               404.0           0.07
Iceweasel              NaN            NaN
Comodo Dragon          NaN            NaN
IE10                 404.0           0.08
Chrome               200.0           0.02
We can fill in the missing values by passing a value to
the keyword fill_value. Because the index is not monotonically
increasing or decreasing, we cannot use arguments to the keyword
method to fill the NaN values.
To further illustrate the filling functionality in
reindex, we will create a dataframe with a
monotonically increasing index (for example, a sequence
of dates).
The index entries that did not have a value in the original data frame
(for example, ‘2009-12-29’) are by default filled with NaN.
If desired, we can fill in the missing values using one of several
options.
For example, to back-propagate the last valid value to fill the NaN
values, pass bfill as an argument to the method keyword.
Please note that the NaN value present in the original dataframe
(at index value 2010-01-03) will not be filled by any of the
value propagation schemes. This is because filling while reindexing
does not look at dataframe values, but only compares the original and
desired indexes. If you do want to fill in the NaN values present
in the original dataframe, use the fillna() method.
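A compact version of the date-index example described above (standard pandas usage):

>>> import numpy as np, pandas as pd
>>> date_index = pd.date_range('1/1/2010', periods=6, freq='D')
>>> df2 = pd.DataFrame({'prices': [100, 101, np.nan, 100, 89, 88]}, index=date_index)
>>> date_index2 = pd.date_range('12/29/2009', periods=10, freq='D')
>>> df2.reindex(date_index2, method='bfill')
            prices
2009-12-29   100.0
2009-12-30   100.0
2009-12-31   100.0
2010-01-01   100.0
2010-01-02   101.0
2010-01-03     NaN
2010-01-04   100.0
2010-01-05    89.0
2010-01-06    88.0
2010-01-07     NaN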
Remove rows or columns by specifying label names and corresponding
axis, or by specifying directly index or column names. When using a
multi-index, labels on different levels can be removed by specifying
the level. See the user guide
for more information about the now unused levels.
Dict-like or function transformations to apply to
that axis’ values. Use either mapper and axis to
specify the axis to target with mapper, or index and
columns.
index – dict-like or function
Alternative to specifying axis (mapper,axis=0
is equivalent to index=mapper).
columns – dict-like or function
Alternative to specifying axis (mapper,axis=1
is equivalent to columns=mapper).
axis – {0 or ‘index’, 1 or ‘columns’}, default 0
Axis to target with mapper. Can be either the axis name
(‘index’, ‘columns’) or number (0, 1). The default is ‘index’.
copy – bool, default True
Also copy underlying data.
inplace – bool, default False
Whether to return a new DataFrame. If True then value of copy is
ignored.
level – int or level name, default None
In case of a MultiIndex, only rename labels in the specified
level.
errors – {‘ignore’, ‘raise’}, default ‘ignore’
If ‘raise’, raise a KeyError when a dict-like mapper, index,
or columns contains labels that are not present in the Index
being transformed.
If ‘ignore’, existing keys will be renamed and extra keys will be
ignored.
Value to use to fill holes (e.g. 0), alternately a
dict/Series/DataFrame of values specifying which value to use for
each index (for a Series) or column (for a DataFrame). Values not
in the dict/Series/DataFrame will not be filled. This value cannot
be a list.
Method to use for filling holes in reindexed Series
pad / ffill: propagate last valid observation forward to next valid
backfill / bfill: use next valid observation to fill gap.
axis – {0 or ‘index’, 1 or ‘columns’}
Axis along which to fill missing values.
inplace – bool, default False
If True, fill in-place. Note: this will modify any
other views on this object (e.g., a no-copy slice for a column in a
DataFrame).
limit – int, default None
If method is specified, this is the maximum number of consecutive
NaN values to forward/backward fill. In other words, if there is
a gap with more than this number of consecutive NaNs, it will only
be partially filled. If method is not specified, this is the
maximum number of entries along the entire axis where NaNs will be
filled. Must be greater than 0 if not None.
downcast – dict, default is None
A dict of item->dtype of what to downcast if possible,
or the string ‘infer’ which will try to downcast to an appropriate
equal type (e.g. float64 to int64 if possible).
>>> df = pd.DataFrame([[np.nan, 2, np.nan, 0],
...                    [3, 4, np.nan, 1],
...                    [np.nan, np.nan, np.nan, 5],
...                    [np.nan, 3, np.nan, 4]],
...                   columns=list("ABCD"))
>>> df
     A    B   C  D
0  NaN  2.0 NaN  0
1  3.0  4.0 NaN  1
2  NaN  NaN NaN  5
3  NaN  3.0 NaN  4
Replace all NaN elements with 0s.
>>> df.fillna(0)
     A    B    C  D
0  0.0  2.0  0.0  0
1  3.0  4.0  0.0  1
2  0.0  0.0  0.0  5
3  0.0  3.0  0.0  4
We can also propagate non-null values forward or backward.
>>> df.fillna(method="ffill")
     A    B   C  D
0  NaN  2.0 NaN  0
1  3.0  4.0 NaN  1
2  3.0  4.0 NaN  5
3  3.0  3.0 NaN  4
Replace all NaN elements in column ‘A’, ‘B’, ‘C’, and ‘D’, with 0, 1,
2, and 3 respectively.
>>> values = {"A": 0, "B": 1, "C": 2, "D": 3}
>>> df.fillna(value=values)
     A    B    C  D
0  0.0  2.0  2.0  0
1  3.0  4.0  2.0  1
2  0.0  1.0  2.0  5
3  0.0  3.0  2.0  4
Only replace the first NaN element.
>>> df.fillna(value=values, limit=1)
     A    B    C  D
0  0.0  2.0  2.0  0
1  3.0  4.0  NaN  1
2  NaN  1.0  NaN  5
3  NaN  3.0  NaN  4
When filling using a DataFrame, replacement happens along
the same column names and same indices
>>> df2=pd.DataFrame(np.zeros((4,4)),columns=list("ABCE"))>>> df.fillna(df2) A B C D0 0.0 2.0 0.0 01 3.0 4.0 0.0 12 0.0 0.0 0.0 53 0.0 3.0 0.0 4
to_replace : str, regex, list, dict, Series, int, float, or None
How to find the values that will be replaced.
numeric, str or regex:
numeric: numeric values equal to to_replace will be
replaced with value
str: string exactly matching to_replace will be replaced
with value
regex: regexs matching to_replace will be replaced with
value
list of str, regex, or numeric:
First, if to_replace and value are both lists, they
must be the same length.
Second, if regex=True then all of the strings in both
lists will be interpreted as regexs otherwise they will match
directly. This doesn’t matter much for value since there
are only a few possible substitution regexes you can use.
str, regex and numeric rules apply as above.
dict:
Dicts can be used to specify different replacement values
for different existing values. For example,
{'a':'b','y':'z'} replaces the value ‘a’ with ‘b’ and
‘y’ with ‘z’. To use a dict in this way the value
parameter should be None.
For a DataFrame a dict can specify that different values
should be replaced in different columns. For example,
{'a':1,'b':'z'} looks for the value 1 in column ‘a’
and the value ‘z’ in column ‘b’ and replaces these values
with whatever is specified in value. The value parameter
should not be None in this case. You can treat this as a
special case of passing two lists except that you are
specifying the column to search in.
For a DataFrame nested dictionaries, e.g.,
{'a':{'b':np.nan}}, are read as follows: look in column
‘a’ for the value ‘b’ and replace it with NaN. The value
parameter should be None to use a nested dict in this
way. You can nest regular expressions as well. Note that
column names (the top-level dictionary keys in a nested
dictionary) cannot be regular expressions.
None:
This means that the regex argument must be a string,
compiled regular expression, or list, dict, ndarray or
Series of such elements. If value is also None then
this must be a nested dictionary or Series.
See the examples section for examples of each of these.
value : scalar, dict, list, str, regex, default None
Value to replace any values matching to_replace with.
For a DataFrame a dict of values can be used to specify which
value to use for each column (columns not in the dict will not be
filled). Regular expressions, strings and lists or dicts of such
objects are also allowed.
inplace : bool, default False
If True, performs operation inplace and returns None.
limit : int, default None
Maximum size gap to forward or backward fill.
regex : bool or same types as to_replace, default False
Whether to interpret to_replace and/or value as regular
expressions. If this is True then to_replace must be a
string. Alternatively, this could be a regular expression or a
list, dict, or array of regular expressions in which case
to_replace must be None.
method : {‘pad’, ‘ffill’, ‘bfill’, None}
The method to use for replacement, when to_replace is a
scalar, list or tuple and value is None.
Regex substitution is performed under the hood with re.sub. The
rules for substitution for re.sub are the same.
Regular expressions will only substitute on strings, meaning you
cannot provide, for example, a regular expression matching floating
point numbers and expect the columns in your frame that have a
numeric dtype to be matched. However, if those floating point
numbers are strings, then you can do this.
This method has a lot of options. You are encouraged to experiment
and play with this method to gain intuition about how it works.
When dict is used as the to_replace value, it is like
key(s) in the dict are the to_replace part and
value(s) in the dict are the value parameter.
>>> df.replace({0:10,1:100}) A B C0 10 5 a1 100 6 b2 2 7 c3 3 8 d4 4 9 e
>>> df.replace({'A':0,'B':5},100) A B C0 100 100 a1 1 6 b2 2 7 c3 3 8 d4 4 9 e
>>> df.replace({'A':{0:100,4:400}}) A B C0 100 5 a1 1 6 b2 2 7 c3 3 8 d4 400 9 e
Regular expression `to_replace`
>>> df=pd.DataFrame({'A':['bat','foo','bait'],... 'B':['abc','bar','xyz']})>>> df.replace(to_replace=r'^ba.$',value='new',regex=True) A B0 new abc1 foo new2 bait xyz
>>> df.replace({'A':r'^ba.$'},{'A':'new'},regex=True) A B0 new abc1 foo bar2 bait xyz
>>> df.replace(regex=r'^ba.$',value='new') A B0 new abc1 foo new2 bait xyz
>>> df.replace(regex={r'^ba.$':'new','foo':'xyz'}) A B0 new abc1 xyz new2 bait xyz
>>> df.replace(regex=[r'^ba.$','foo'],value='new') A B0 new abc1 new new2 bait xyz
Compare the behavior of s.replace({'a':None}) and
s.replace('a',None) to understand the peculiarities
of the to_replace parameter:
>>> s=pd.Series([10,'a','a','b','a'])
When one uses a dict as the to_replace value, it is like the
value(s) in the dict are equal to the value parameter.
s.replace({'a':None}) is equivalent to
s.replace(to_replace={'a':None},value=None,method=None):
When value=None and to_replace is a scalar, list or
tuple, replace uses the method parameter (default ‘pad’) to do the
replacement. So this is why the ‘a’ values are being replaced by 10
in rows 1 and 2 and ‘b’ in row 4 in this case.
The command s.replace('a',None) is actually equivalent to
s.replace(to_replace='a',value=None,method='pad'):
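For reference, a sketch of the two calls on the Series above, with the behaviour described (outputs shown here are illustrative):
>>> s.replace({'a': None})
0      10
1    None
2    None
3       b
4    None
dtype: object
>>> s.replace('a', None)
0    10
1    10
2    10
3     b
4     b
dtype: object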
Shift index by desired number of periods with an optional time freq.
When freq is not passed, shift the index without realigning the data.
If freq is passed (in this case, the index must be date or datetime,
or it will raise a NotImplementedError), the index will be
increased using the periods and the freq. freq can be inferred
when specified as “infer” as long as either freq or inferred_freq
attribute is set in the index.
Number of periods to shift. Can be positive or negative.
freq : DateOffset, tseries.offsets, timedelta, or str, optional
Offset to use from the tseries module or time rule (e.g. ‘EOM’).
If freq is specified then the index values are shifted but the
data is not realigned. That is, use freq if you would like to
extend the index when shifting and preserve the original data.
If freq is specified as “infer” then it will be inferred from
the freq or inferred_freq attributes of the index. If neither of
those attributes exist, a ValueError is thrown.
axis : {0 or ‘index’, 1 or ‘columns’, None}, default None
Shift direction.
fill_value : object, optional
The scalar value to use for newly introduced missing values.
The default depends on the dtype of self.
For numeric data, np.nan is used.
For datetime, timedelta, or period data, etc. NaT is used.
For extension dtypes, self.dtype.na_value is used.
Index.shift : Shift values of Index.
DatetimeIndex.shift : Shift values of DatetimeIndex.
PeriodIndex.shift : Shift values of PeriodIndex.
tshift : Shift the time index, using the index’s frequency if available.
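The shift examples below operate on a small daily-indexed frame; one assumed construction consistent with the outputs shown is:
>>> df = pd.DataFrame({"Col1": [10, 20, 15, 30, 45],
...                    "Col2": [13, 23, 18, 33, 48],
...                    "Col3": [17, 27, 22, 37, 52]},
...                   index=pd.date_range("2020-01-01", "2020-01-05"))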
>>> df.shift(periods=3) Col1 Col2 Col32020-01-01 NaN NaN NaN2020-01-02 NaN NaN NaN2020-01-03 NaN NaN NaN2020-01-04 10.0 13.0 17.02020-01-05 20.0 23.0 27.0
>>> df.shift(periods=1,axis="columns") Col1 Col2 Col32020-01-01 NaN 10 132020-01-02 NaN 20 232020-01-03 NaN 15 182020-01-04 NaN 30 332020-01-05 NaN 45 48
Set the DataFrame index (row labels) using one or more existing
columns or arrays (of the correct length). The index can replace the
existing index or expand on it.
This parameter can be either a single column key, a single array of
the same length as the calling DataFrame, or a list containing an
arbitrary combination of column keys and arrays. Here, “array”
encompasses Series, Index, np.ndarray, and
instances of Iterator.
drop : bool, default True
Delete columns to be used as the new index.
append : bool, default False
Whether to append columns to existing index.
inplace : bool, default False
If True, modifies the DataFrame in place (do not create a new object).
verify_integrity : bool, default False
Check the new index for duplicates. Otherwise defer the check until
necessary. Setting to False will improve the performance of this
method.
DataFrame.reset_index : Opposite of set_index.
DataFrame.reindex : Change to new indices or expand indices.
DataFrame.reindex_like : Change to same indices as other DataFrame.
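A minimal set_index sketch (values assumed for illustration):
>>> df = pd.DataFrame({'month': [1, 4, 7, 10],
...                    'year': [2012, 2014, 2013, 2014],
...                    'sale': [55, 40, 84, 31]})
>>> df.set_index('month')
       year  sale
month
1      2012    55
4      2014    40
7      2013    84
10     2014    31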
DataFrame.set_index : Opposite of reset_index.
DataFrame.reindex : Change to new indices or expand indices.
DataFrame.reindex_like : Change to same indices as other DataFrame.
>>> df=pd.DataFrame([('bird',389.0),... ('bird',24.0),... ('mammal',80.5),... ('mammal',np.nan)],... index=['falcon','parrot','lion','monkey'],... columns=('class','max_speed'))>>> df class max_speedfalcon bird 389.0parrot bird 24.0lion mammal 80.5monkey mammal NaN
When we reset the index, the old index is added as a column, and a
new sequential index is used:
>>> df.reset_index() index class max_speed0 falcon bird 389.01 parrot bird 24.02 lion mammal 80.53 monkey mammal NaN
We can use the drop parameter to avoid the old index being added as
a column:
>>> df.reset_index(drop=True) class max_speed0 bird 389.01 bird 24.02 mammal 80.53 mammal NaN
You can also use reset_index with MultiIndex.
>>> index=pd.MultiIndex.from_tuples([('bird','falcon'),... ('bird','parrot'),... ('mammal','lion'),... ('mammal','monkey')],... names=['class','name'])>>> columns=pd.MultiIndex.from_tuples([('speed','max'),... ('species','type')])>>> df=pd.DataFrame([(389.0,'fly'),... (24.0,'fly'),... (80.5,'run'),... (np.nan,'jump')],... index=index,... columns=columns)>>> df speed species max typeclass namebird falcon 389.0 fly parrot 24.0 flymammal lion 80.5 run monkey NaN jump
If the index has multiple levels, we can reset a subset of them:
>>> df.reset_index(level='class') class speed species max typenamefalcon bird 389.0 flyparrot bird 24.0 flylion mammal 80.5 runmonkey mammal NaN jump
If we are not dropping the index, by default, it is placed in the top
level. We can place it in another level:
>>> df.reset_index(level='class',col_level=1) speed species class max typenamefalcon bird 389.0 flyparrot bird 24.0 flylion mammal 80.5 runmonkey mammal NaN jump
When the index is inserted under another level, we can specify under
which one with the parameter col_fill:
>>> df.reset_index(level='class',col_level=1,col_fill='species') species speed species class max typenamefalcon bird 389.0 flyparrot bird 24.0 flylion mammal 80.5 runmonkey mammal NaN jump
If we specify a nonexistent level for col_fill, it is created:
>>> df.reset_index(level='class',col_level=1,col_fill='genus') genus speed species class max typenamefalcon bird 389.0 flyparrot bird 24.0 flylion mammal 80.5 runmonkey mammal NaN jump
Return a boolean same-sized object indicating if the values are NA.
NA values, such as None or numpy.NaN, gets mapped to True
values.
Everything else gets mapped to False values. Characters such as empty
strings '' or numpy.inf are not considered NA values
(unless you set pandas.options.mode.use_inf_as_na=True).
>>> df=pd.DataFrame(dict(age=[5,6,np.NaN],... born=[pd.NaT,pd.Timestamp('1939-05-27'),... pd.Timestamp('1940-04-25')],... name=['Alfred','Batman',''],... toy=[None,'Batmobile','Joker']))>>> df age born name toy0 5.0 NaT Alfred None1 6.0 1939-05-27 Batman Batmobile2 NaN 1940-04-25 Joker
>>> df.isna() age born name toy0 False True False True1 False False False False2 True False False False
Return a boolean same-sized object indicating if the values are not NA.
Non-missing values get mapped to True. Characters such as empty
strings '' or numpy.inf are not considered NA values
(unless you set pandas.options.mode.use_inf_as_na=True).
NA values, such as None or numpy.NaN, get mapped to False
values.
>>> df=pd.DataFrame(dict(age=[5,6,np.NaN],... born=[pd.NaT,pd.Timestamp('1939-05-27'),... pd.Timestamp('1940-04-25')],... name=['Alfred','Batman',''],... toy=[None,'Batmobile','Joker']))>>> df age born name toy0 5.0 NaT Alfred None1 6.0 1939-05-27 Batman Batmobile2 NaN 1940-04-25 Joker
>>> df.notna() age born name toy0 True False True False1 True True True True2 False True True True
>>> df=pd.DataFrame({"name":['Alfred','Batman','Catwoman'],... "toy":[np.nan,'Batmobile','Bullwhip'],... "born":[pd.NaT,pd.Timestamp("1940-04-25"),... pd.NaT]})>>> df name toy born0 Alfred NaN NaT1 Batman Batmobile 1940-04-252 Catwoman Bullwhip NaT
Drop the rows where at least one element is missing.
>>> df.dropna() name toy born1 Batman Batmobile 1940-04-25
Drop the columns where at least one element is missing.
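For the frame above, one way this might look (keeping only the columns with no missing values):
>>> df.dropna(axis='columns')
       name
0    Alfred
1    Batman
2  Catwoman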
subset : column label or sequence of labels, optional
Only consider certain columns for identifying duplicates, by
default use all of the columns.
keep : {‘first’, ‘last’, False}, default ‘first’
Determines which duplicates (if any) to keep.
- first : Drop duplicates except for the first occurrence.
- last : Drop duplicates except for the last occurrence.
- False : Drop all duplicates.
inplace : bool, default False
Whether to drop duplicates in place or to return a copy.
ignore_index : bool, default False
If True, the resulting axis will be labeled 0, 1, …, n - 1.
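A short drop_duplicates sketch (data assumed for illustration); by default the first occurrence of each duplicated row is kept:
>>> df = pd.DataFrame({
...     'brand': ['Yum Yum', 'Yum Yum', 'Indomie', 'Indomie', 'Indomie'],
...     'style': ['cup', 'cup', 'cup', 'pack', 'pack'],
...     'rating': [4, 4, 3.5, 15, 5]})
>>> df.drop_duplicates()
     brand style  rating
0  Yum Yum   cup     4.0
2  Indomie   cup     3.5
3  Indomie  pack    15.0
4  Indomie  pack     5.0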
Choice of sorting algorithm. See also numpy.sort() for more
information. mergesort and stable are the only stable algorithms. For
DataFrames, this option is only applied when sorting on a single
column or label.
na_position : {‘first’, ‘last’}, default ‘last’
Puts NaNs at the beginning if first; last puts NaNs at the
end.
ignore_index : bool, default False
If True, the resulting axis will be labeled 0, 1, …, n - 1.
New in version 1.0.0.
key : callable, optional
Apply the key function to the values
before sorting. This is similar to the key argument in the
builtin sorted() function, with the notable difference that
this key function should be vectorized. It should expect a
Series and return a Series with the same shape as the input.
It will be applied to each column in by independently.
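A small sketch of sorting with a vectorized key (frame assumed for illustration); the key lower-cases the column before comparing:
>>> df = pd.DataFrame({'col1': ['b', 'A', 'c', 'D'], 'col2': [1, 2, 3, 4]})
>>> df.sort_values(by='col1', key=lambda col: col.str.lower())
  col1  col2
1    A     2
0    b     1
2    c     3
3    D     4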
Choice of sorting algorithm. See also numpy.sort() for more
information. mergesort and stable are the only stable algorithms. For
DataFrames, this option is only applied when sorting on a single
column or label.
na_position : {‘first’, ‘last’}, default ‘last’
Puts NaNs at the beginning if first; last puts NaNs at the end.
Not implemented for MultiIndex.
sort_remaining : bool, default True
If True and sorting by level and index is multilevel, sort by other
levels too (in order) after sorting by specified level.
ignore_index : bool, default False
If True, the resulting axis will be labeled 0, 1, …, n - 1.
New in version 1.0.0.
key : callable, optional
If not None, apply the key function to the index values
before sorting. This is similar to the key argument in the
builtin sorted() function, with the notable difference that
this key function should be vectorized. It should expect an
Index and return an Index of the same shape. For MultiIndex
inputs, the key is applied per level.
The returned Series will have a MultiIndex with one level per input
column. By default, rows that contain any NA values are omitted from
the result. By default, the resulting Series will be in descending
order so that the first element is the most frequently-occurring row.
With dropna set to False we can also count rows with NA values.
>>> df=pd.DataFrame({'first_name':['John','Anne','John','Beth'],... 'middle_name':['Smith',pd.NA,pd.NA,'Louise']})>>> df first_name middle_name0 John Smith1 Anne <NA>2 John <NA>3 Beth Louise
>>> df.value_counts()first_name middle_nameBeth Louise 1John Smith 1dtype: int64
>>> df.value_counts(dropna=False)first_name middle_nameAnne NaN 1Beth Louise 1John Smith 1 NaN 1dtype: int64
Return the first n rows ordered by columns in descending order.
Return the first n rows with the largest values in columns, in
descending order. The columns that are not specified are returned as
well, but not used for ordering.
This method is equivalent to
df.sort_values(columns,ascending=False).head(n), but more
performant.
Return the first n rows ordered by columns in ascending order.
Return the first n rows with the smallest values in columns, in
ascending order. The columns that are not specified are returned as
well, but not used for ordering.
This method is equivalent to
df.sort_values(columns,ascending=True).head(n), but more
performant.
>>> df=pd.DataFrame(... {"Grade":["A","B","A","C"]},... index=[... ["Final exam","Final exam","Coursework","Coursework"],... ["History","Geography","History","Geography"],... ["January","February","March","April"],... ],... )>>> df GradeFinal exam History January A Geography February BCoursework History March A Geography April C
In the following example, we will swap the levels of the indices.
Here, we will swap the levels column-wise, but levels can be swapped row-wise
in a similar manner. Note that column-wise is the default behaviour.
By not supplying any arguments for i and j, we swap the last and second to
last indices.
>>> df.swaplevel() GradeFinal exam January History A February Geography BCoursework March History A April Geography C
By supplying one argument, we can choose which index to swap the last
index with. We can for example swap the first index with the last one as
follows.
>>> df.swaplevel(0) GradeJanuary History Final exam AFebruary Geography Final exam BMarch History Coursework AApril Geography Coursework C
We can also define explicitly which indices we want to swap by supplying values
for both i and j. Here, we for example swap the first and second indices.
>>> df.swaplevel(0,1) GradeHistory Final exam January AGeography Final exam February BHistory Coursework March AGeography Coursework April C
>>> df=pd.DataFrame(... {... "col1":["a","a","b","b","a"],... "col2":[1.0,2.0,3.0,np.nan,5.0],... "col3":[1.0,2.0,3.0,4.0,5.0]... },... columns=["col1","col2","col3"],... )>>> df col1 col2 col30 a 1.0 1.01 a 2.0 2.02 b 3.0 3.03 b NaN 4.04 a 5.0 5.0
>>> df2=df.copy()>>> df2.loc[0,'col1']='c'>>> df2.loc[2,'col3']=4.0>>> df2 col1 col2 col30 c 1.0 1.01 a 2.0 2.02 b 3.0 4.03 b NaN 4.04 a 5.0 5.0
Align the differences on columns
>>> df.compare(df2) col1 col3 self other self other0 a c NaN NaN2 NaN NaN 3.0 4.0
Stack the differences on rows
>>> df.compare(df2,align_axis=0) col1 col30 self a NaN other c NaN2 self NaN 3.0 other NaN 4.0
Keep the equal values
>>> df.compare(df2,keep_equal=True) col1 col3 self other self other0 a c 1.0 1.02 b b 3.0 4.0
Keep all original rows and columns
>>> df.compare(df2,keep_shape=True) col1 col2 col3 self other self other self other0 a c NaN NaN NaN NaN1 NaN NaN NaN NaN NaN NaN2 NaN NaN NaN NaN 3.0 4.03 NaN NaN NaN NaN NaN NaN4 NaN NaN NaN NaN NaN NaN
Keep all original rows and columns and also all original values
>>> df.compare(df2,keep_shape=True,keep_equal=True) col1 col2 col3 self other self other self other0 a c 1.0 1.0 1.0 1.01 a a 2.0 2.0 2.0 2.02 b b 3.0 3.0 3.0 4.03 b b NaN NaN 4.0 4.04 a a 5.0 5.0 5.0 5.0
Perform column-wise combine with another DataFrame.
Combines a DataFrame with other DataFrame using func
to element-wise combine columns. The row and column indexes of the
resulting DataFrame will be the union of the two.
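Several of the examples below pass a helper called take_smaller; its definition is not shown here, but a plausible sketch is a function that keeps whichever column has the smaller sum:
>>> take_smaller = lambda s1, s2: s1 if s1.sum() < s2.sum() else s2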
Example using a true element-wise combine function.
>>> df1=pd.DataFrame({'A':[5,0],'B':[2,4]})>>> df2=pd.DataFrame({'A':[1,1],'B':[3,3]})>>> df1.combine(df2,np.minimum) A B0 1 21 0 3
Using fill_value fills Nones prior to passing the column to the
merge function.
>>> df1=pd.DataFrame({'A':[0,0],'B':[None,4]})>>> df2=pd.DataFrame({'A':[1,1],'B':[3,3]})>>> df1.combine(df2,take_smaller,fill_value=-5) A B0 0 -5.01 0 4.0
However, if the same element in both dataframes is None, that None
is preserved
>>> df1=pd.DataFrame({'A':[0,0],'B':[None,4]})>>> df2=pd.DataFrame({'A':[1,1],'B':[None,3]})>>> df1.combine(df2,take_smaller,fill_value=-5) A B0 0 -5.01 0 3.0
Example that demonstrates the use of overwrite and behavior when
the axis differ between the dataframes.
>>> df1=pd.DataFrame({'A':[0,0],'B':[4,4]})>>> df2=pd.DataFrame({'B':[3,3],'C':[-10,1],},index=[1,2])>>> df1.combine(df2,take_smaller) A B C0 NaN NaN NaN1 NaN 3.0 -10.02 NaN 3.0 1.0
>>> df1.combine(df2,take_smaller,overwrite=False) A B C0 0.0 NaN NaN1 0.0 3.0 -10.02 NaN 3.0 1.0
Demonstrating the preference of the passed in dataframe.
>>> df2=pd.DataFrame({'B':[3,3],'C':[1,1],},index=[1,2])>>> df2.combine(df1,take_smaller) A B C0 0.0 NaN NaN1 0.0 3.0 NaN2 NaN 3.0 NaN
>>> df2.combine(df1,take_smaller,overwrite=False) A B C0 0.0 NaN NaN1 0.0 3.0 1.02 NaN 3.0 1.0
Update null elements with value in the same location in other.
Combine two DataFrame objects by filling null values in one DataFrame
with non-null values from other DataFrame. The row and column indexes
of the resulting DataFrame will be the union of the two.
>>> df1=pd.DataFrame({'A':[None,0],'B':[None,4]})>>> df2=pd.DataFrame({'A':[1,1],'B':[3,3]})>>> df1.combine_first(df2) A B0 1.0 3.01 0.0 4.0
Null values still persist if the location of that null value
does not exist in other
>>> df1=pd.DataFrame({'A':[None,0],'B':[4,None]})>>> df2=pd.DataFrame({'B':[3,3],'C':[1,1]},index=[1,2])>>> df1.combine_first(df2) A B C0 NaN 4.0 NaN1 0.0 3.0 1.02 NaN 3.0 1.0
other : DataFrame, or object coercible into a DataFrame
Should have at least one matching index/column label
with the original DataFrame. If a Series is passed,
its name attribute must be set, and that will be
used as the column name to align with the original DataFrame.
join : {‘left’}, default ‘left’
Only left join is implemented, keeping the index and columns of the
original object.
overwrite : bool, default True
How to handle non-NA values for overlapping keys:
True: overwrite original DataFrame’s values
with values from other.
False: only update values that are NA in
the original DataFrame.
The DataFrame’s length does not increase as a result of the update,
only values at matching index/column labels are updated.
>>> df=pd.DataFrame({'A':['a','b','c'],... 'B':['x','y','z']})>>> new_df=pd.DataFrame({'B':['d','e','f','g','h','i']})>>> df.update(new_df)>>> df A B0 a d1 b e2 c f
For Series, its name attribute must be set.
>>> df=pd.DataFrame({'A':['a','b','c'],... 'B':['x','y','z']})>>> new_column=pd.Series(['d','e'],name='B',index=[0,2])>>> df.update(new_column)>>> df A B0 a d1 b y2 c e>>> df=pd.DataFrame({'A':['a','b','c'],... 'B':['x','y','z']})>>> new_df=pd.DataFrame({'B':['d','e']},index=[1,2])>>> df.update(new_df)>>> df A B0 a x1 b d2 c e
If other contains NaNs the corresponding values are not updated
in the original dataframe.
Group DataFrame using a mapper or by a Series of columns.
A groupby operation involves some combination of splitting the
object, applying a function, and combining the results. This can be
used to group large amounts of data and compute operations on these
groups.
Used to determine the groups for the groupby.
If by is a function, it’s called on each value of the object’s
index. If a dict or Series is passed, the Series or dict VALUES
will be used to determine the groups (the Series’ values are first
aligned; see .align() method). If an ndarray is passed, the
values are used as-is to determine the groups. A label or list of
labels may be passed to group by the columns in self. Notice
that a tuple is interpreted as a (single) key.
axis : {0 or ‘index’, 1 or ‘columns’}, default 0
Split along rows (0) or columns (1).
level : int, level name, or sequence of such, default None
If the axis is a MultiIndex (hierarchical), group by a particular
level or levels.
as_index : bool, default True
For aggregated output, return object with group labels as the
index. Only relevant for DataFrame input. as_index=False is
effectively “SQL-style” grouped output.
sort : bool, default True
Sort group keys. Get better performance by turning this off.
Note this does not influence the order of observations within each
group. Groupby preserves the order of rows within each group.
group_keys : bool, default True
When calling apply, add group keys to index to identify pieces.
squeeze : bool, default False
Reduce the dimensionality of the return type if possible,
otherwise return a consistent type.
Deprecated since version 1.1.0.
observed : bool, default False
This only applies if any of the groupers are Categoricals.
If True: only show observed values for categorical groupers.
If False: show all values for categorical groupers.
dropna : bool, default True
If True, and if group keys contain NA values, NA values together
with row/column will be dropped.
If False, NA values will also be treated as the key in groups
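A minimal groupby sketch (data assumed for illustration), grouping on one column and averaging the rest:
>>> df = pd.DataFrame({'Animal': ['Falcon', 'Falcon', 'Parrot', 'Parrot'],
...                    'Max Speed': [380., 370., 24., 26.]})
>>> df.groupby(['Animal']).mean()
        Max Speed
Animal
Falcon      375.0
Parrot       25.0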
Return reshaped DataFrame organized by given index / column values.
Reshape data (produce a “pivot” table) based on column values. Uses
unique values from specified index / columns to form axes of the
resulting DataFrame. This function does not support data
aggregation, multiple values will result in a MultiIndex in the
columns. See the User Guide for more on reshaping.
Column to use to make new frame’s index. If None, uses
existing index.
Changed in version 1.1.0: Also accept list of index names.
columns : str or object or a list of str
Column to use to make new frame’s columns.
Changed in version 1.1.0: Also accept list of columns names.
values : str, object or a list of the previous, optional
Column(s) to use for populating new frame’s values. If not
specified, all remaining columns will be used and the result will
have hierarchically indexed columns.
>>> df=pd.DataFrame({'foo':['one','one','one','two','two',... 'two'],... 'bar':['A','B','C','A','B','C'],... 'baz':[1,2,3,4,5,6],... 'zoo':['x','y','z','q','w','t']})>>> df foo bar baz zoo0 one A 1 x1 one B 2 y2 one C 3 z3 two A 4 q4 two B 5 w5 two C 6 t
>>> df.pivot(index='foo',columns='bar',values='baz')bar A B Cfooone 1 2 3two 4 5 6
>>> df.pivot(index='foo',columns='bar')['baz']bar A B Cfooone 1 2 3two 4 5 6
>>> df.pivot(index='foo',columns='bar',values=['baz','zoo']) baz zoobar A B C A B Cfooone 1 2 3 x y ztwo 4 5 6 q w t
You could also assign a list of column names or a list of index names.
>>> df.pivot(index=["lev1","lev2"],columns=["lev3"],values="values") lev3 1 2lev1 lev2 1 1 0.0 1.0 2 2.0 NaN 2 1 4.0 3.0 2 NaN 5.0
A ValueError is raised if there are any duplicates.
>>> df=pd.DataFrame({"foo":['one','one','two','two'],... "bar":['A','A','B','C'],... "baz":[1,2,3,4]})>>> df foo bar baz0 one A 11 one A 22 two B 33 two C 4
Notice that the first two rows are the same for our index
and columns arguments.
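Pivoting this frame would therefore fail, roughly as follows (the exact message may differ between pandas versions):
>>> df.pivot(index='foo', columns='bar', values='baz')
Traceback (most recent call last):
   ...
ValueError: Index contains duplicate entries, cannot reshape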
values : column to aggregate, optional
index : column, Grouper, array, or list of the previous
If an array is passed, it must be the same length as the data. The
list can contain any of the other types (except list).
Keys to group by on the pivot table index. If an array is passed,
it is used in the same manner as column values.
columns : column, Grouper, array, or list of the previous
If an array is passed, it must be the same length as the data. The
list can contain any of the other types (except list).
Keys to group by on the pivot table column. If an array is passed,
it is used in the same manner as column values.
aggfunc : function, list of functions, dict, default numpy.mean
If list of functions passed, the resulting pivot table will have
hierarchical columns whose top level are the function names
(inferred from the function objects themselves).
If dict is passed, the key is column to aggregate and value
is function or list of functions.
fill_value : scalar, default None
Value to replace missing values with (in the resulting pivot table,
after aggregation).
margins : bool, default False
Add all row / columns (e.g. for subtotal / grand totals).
dropna : bool, default True
Do not include columns whose entries are all NaN.
margins_name : str, default ‘All’
Name of the row / column that will contain the totals
when margins is True.
observed : bool, default False
This only applies if any of the groupers are Categoricals.
If True: only show observed values for categorical groupers.
If False: show all values for categorical groupers.
>>> df=pd.DataFrame({"A":["foo","foo","foo","foo","foo",... "bar","bar","bar","bar"],... "B":["one","one","one","two","two",... "one","one","two","two"],... "C":["small","large","large","small",... "small","large","small","small",... "large"],... "D":[1,2,2,3,3,4,5,6,7],... "E":[2,4,5,5,6,6,8,9,9]})>>> df A B C D E0 foo one small 1 21 foo one large 2 42 foo one large 2 53 foo two small 3 54 foo two small 3 65 bar one large 4 66 bar one small 5 87 bar two small 6 98 bar two large 7 9
This first example aggregates values by taking the sum.
>>> table=pd.pivot_table(df,values='D',index=['A','B'],... columns=['C'],aggfunc=np.sum)>>> tableC large smallA Bbar one 4.0 5.0 two 7.0 6.0foo one 4.0 1.0 two NaN 6.0
We can also fill missing values using the fill_value parameter.
>>> table=pd.pivot_table(df,values='D',index=['A','B'],... columns=['C'],aggfunc=np.sum,fill_value=0)>>> tableC large smallA Bbar one 4 5 two 7 6foo one 4 1 two 0 6
The next example aggregates by taking the mean across multiple columns.
>>> table=pd.pivot_table(df,values=['D','E'],index=['A','C'],... aggfunc={'D':np.mean,... 'E':np.mean})>>> table D EA Cbar large 5.500000 7.500000 small 5.500000 8.500000foo large 2.000000 4.500000 small 2.333333 4.333333
We can also calculate multiple types of aggregations for any given
value column.
>>> table=pd.pivot_table(df,values=['D','E'],index=['A','C'],... aggfunc={'D':np.mean,... 'E':[min,max,np.mean]})>>> table D E mean max mean minA Cbar large 5.500000 9.0 7.500000 6.0 small 5.500000 9.0 8.500000 8.0foo large 2.000000 5.0 4.500000 4.0 small 2.333333 6.0 4.333333 2.0
Stack the prescribed level(s) from columns to index.
Return a reshaped DataFrame or Series having a multi-level
index with one or more new inner-most levels compared to the current
DataFrame. The new inner-most levels are created by pivoting the
columns of the current dataframe:
if the columns have a single level, the output is a Series;
if the columns have multiple levels, the new index
level(s) is (are) taken from the prescribed level(s) and
the output is a DataFrame.
Level(s) to stack from the column axis onto the index
axis, defined as one index or label, or a list of indices
or labels.
dropna : bool, default True
Whether to drop rows in the resulting Frame/Series with
missing values. Stacking a column level onto the index
axis can create combinations of index and column values
that are missing from the original dataframe. See Examples
section.
The function is named by analogy with a collection of books
being reorganized from being side by side on a horizontal
position (the columns of the dataframe) to being stacked
vertically on top of each other (in the index of the
dataframe).
It is common to have missing values when stacking a dataframe
with multi-level columns, as the stacked dataframe typically
has more values than the original dataframe. Missing values
are filled with NaNs:
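The stacking examples below use a frame with multi-level columns; an assumed construction consistent with the outputs shown is:
>>> multicol2 = pd.MultiIndex.from_tuples([('weight', 'kg'), ('height', 'm')])
>>> df_multi_level_cols2 = pd.DataFrame([[1.0, 2.0], [3.0, 4.0]],
...                                     index=['cat', 'dog'],
...                                     columns=multicol2)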
>>> df_multi_level_cols2 weight height kg mcat 1.0 2.0dog 3.0 4.0>>> df_multi_level_cols2.stack() height weightcat kg NaN 1.0 m 2.0 NaNdog kg NaN 3.0 m 4.0 NaN
Prescribing the level(s) to be stacked
The first parameter controls which level or levels are stacked:
>>> df_multi_level_cols2.stack(0) kg mcat height NaN 2.0 weight 1.0 NaNdog height NaN 4.0 weight 3.0 NaN>>> df_multi_level_cols2.stack([0,1])cat height m 2.0 weight kg 1.0dog height m 4.0 weight kg 3.0dtype: float64
Note that rows where all values are missing are dropped by
default but this behaviour can be controlled via the dropna
keyword parameter:
>>> df_multi_level_cols3 weight height kg mcat NaN 1.0dog 2.0 3.0>>> df_multi_level_cols3.stack(dropna=False) height weightcat kg NaN NaN m 1.0 NaNdog kg NaN 2.0 m 3.0 NaN>>> df_multi_level_cols3.stack(dropna=True) height weightcat m 1.0 NaNdog kg NaN 2.0 m 3.0 NaN
Column(s) to explode.
For multiple columns, specify a non-empty list with each element
being str or tuple; the list-like data in all specified columns must
have matching lengths on each row of the frame.
New in version 1.3.0: Multi-column explode.
ignore_index : bool, default False
If True, the resulting index will be labeled 0, 1, …, n - 1.
This routine will explode list-likes including lists, tuples, sets,
Series, and np.ndarray. The result dtype of the subset rows will
be object. Scalars will be returned unchanged, and empty list-likes will
result in a np.nan for that row. In addition, the ordering of rows in the
output will be non-deterministic when exploding sets.
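A short explode sketch (data assumed for illustration); note the repeated index labels and the NaN produced by the empty list:
>>> df = pd.DataFrame({'A': [[0, 1, 2], 'foo', [], [3, 4]], 'B': 1})
>>> df.explode('A')
     A  B
0    0  1
0    1  1
0    2  1
1  foo  1
2  NaN  1
3    3  1
3    4  1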
>>> index=pd.MultiIndex.from_tuples([('one','a'),('one','b'),... ('two','a'),('two','b')])>>> s=pd.Series(np.arange(1.0,5.0),index=index)>>> sone a 1.0 b 2.0two a 3.0 b 4.0dtype: float64
>>> s.unstack(level=-1) a bone 1.0 2.0two 3.0 4.0
>>> s.unstack(level=0) one twoa 1.0 3.0b 2.0 4.0
>>> df=s.unstack(level=0)>>> df.unstack()one a 1.0 b 2.0two a 3.0 b 4.0dtype: float64
Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.
This function is useful to massage a DataFrame into a format where one
or more columns are identifier variables (id_vars), while all other
columns, considered measured variables (value_vars), are “unpivoted” to
the row axis, leaving just two non-identifier columns, ‘variable’ and
‘value’.
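A minimal melt sketch (data assumed for illustration), keeping ‘A’ as the identifier and unpivoting ‘B’:
>>> df = pd.DataFrame({'A': ['a', 'b', 'c'],
...                    'B': [1, 3, 5],
...                    'C': [2, 4, 6]})
>>> df.melt(id_vars=['A'], value_vars=['B'])
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5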
For boolean dtypes, this uses operator.xor() rather than
operator.sub().
The result is calculated according to the current dtype in the DataFrame,
but the dtype of the result is always float64.
scalar : when Series.agg is called with single function
Series : when DataFrame.agg is called with a single function
DataFrame : when DataFrame.agg is called with several functions
Return scalar, Series or DataFrame.
The aggregation operations are always performed over an axis, either the
index (default) or the column axis. This behavior is different from
numpy aggregation functions (mean, median, prod, sum, std,
var), where the default is to compute the aggregation of the flattened
array, e.g., numpy.mean(arr_2d) as opposed to
numpy.mean(arr_2d,axis=0).
DataFrame.apply : Perform any type of operations.
DataFrame.transform : Perform transformation type operations.
core.groupby.GroupBy : Perform operations over groups.
core.resample.Resampler : Perform operations over resampled bins.
core.window.Rolling : Perform operations over rolling window.
core.window.Expanding : Perform operations over expanding window.
core.window.ExponentialMovingWindow : Perform operation over exponential weighted window.
Function to use for transforming the data. If a function, must either
work when passed a DataFrame or when passed to DataFrame.apply. If func
is both list-like and dict-like, dict-like behavior takes precedence.
Accepted combinations are:
function
string function name
list-like of functions and/or function names, e.g. [np.exp,'sqrt']
dict-like of axis labels -> functions, function names or list-like of such.
axis : {0 or ‘index’, 1 or ‘columns’}, default 0
If 0 or ‘index’: apply function to each column.
If 1 or ‘columns’: apply function to each row.
>>> df=pd.DataFrame({... "c":[1,1,1,2,2,2,2],... "type":["m","n","o","m","m","n","n"]... })>>> df c type0 1 m1 1 n2 1 o3 2 m4 2 m5 2 n6 2 n>>> df['size']=df.groupby('c')['type'].transform(len)>>> df c type size0 1 m 31 1 n 32 1 o 33 2 m 44 2 m 45 2 n 46 2 n 4
Objects passed to the function are Series objects whose index is
either the DataFrame’s index (axis=0) or the DataFrame’s columns
(axis=1). By default (result_type=None), the final return type
is inferred from the return type of the applied function. Otherwise,
it depends on the result_type argument.
Determines if row or column is passed as a Series or ndarray object:
False : passes each row or column as a Series to the
function.
True : the passed function will receive ndarray objects
instead.
If you are just applying a NumPy reduction function this will
achieve much better performance.
‘expand’ : list-like results will be turned into columns.
‘reduce’ : returns a Series if possible rather than expanding
list-like results. This is the opposite of ‘expand’.
‘broadcast’ : results will be broadcast to the original shape
of the DataFrame, the original index and columns will be
retained.
The default behaviour (None) depends on the return value of the
applied function: list-like results will be returned as a Series
of those. However if the apply function returns a Series these
are expanded to columns.
args : tuple
Positional arguments to pass to func in addition to the
array/series.
DataFrame.applymap: For elementwise operations.
DataFrame.aggregate: Only perform aggregating type operations.
DataFrame.transform: Only perform transforming type operations.
Passing result_type='broadcast' will ensure the same shape
result, whether list-like or scalar is returned by the function,
and broadcast it along the axis. The resulting column names will
be the originals.
>>> df.apply(lambdax:[1,2],axis=1,result_type='broadcast') A B0 1 21 1 22 1 2
If a list of dict/series is passed and the keys are all contained in
the DataFrame’s index, the order of the columns in the resulting
DataFrame will be unchanged.
Iteratively appending rows to a DataFrame can be more computationally
intensive than a single concatenate. A better solution is to append
those rows to a list and then concatenate the list with the original
DataFrame all at once.
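One possible sketch of that pattern (building the pieces first, then concatenating once):
>>> pd.concat([pd.DataFrame([i], columns=['A']) for i in range(5)],
...           ignore_index=True)
   A
0  0
1  1
2  2
3  3
4  4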
Index should be similar to one of the columns in this one. If a
Series is passed, its name attribute must be set, and that will be
used as the column name in the resulting joined DataFrame.
on : str, list of str, or array-like, optional
Column or index level name(s) in the caller to join on the index
in other, otherwise joins index-on-index. If multiple
values given, the other DataFrame must have a MultiIndex. Can
pass an array as the join key if it is not already contained in
the calling DataFrame. Like an Excel VLOOKUP operation.
>>> df.join(other,lsuffix='_caller',rsuffix='_other') key_caller A key_other B0 K0 A0 K0 B01 K1 A1 K1 B12 K2 A2 K2 B23 K3 A3 NaN NaN4 K4 A4 NaN NaN5 K5 A5 NaN NaN
If we want to join using the key columns, we need to set key to be
the index in both df and other. The joined DataFrame will have
key as its index.
>>> df.set_index('key').join(other.set_index('key')) A BkeyK0 A0 B0K1 A1 B1K2 A2 B2K3 A3 NaNK4 A4 NaNK5 A5 NaN
Another option to join using the key columns is to use the on
parameter. DataFrame.join always uses other’s index but we can use
any column in df. This method preserves the original DataFrame’s
index in the result.
Merge DataFrame or named Series objects with a database-style join.
A named Series object is treated as a DataFrame with a single named column.
The join is done on columns or indexes. If joining columns on
columns, the DataFrame indexes will be ignored. Otherwise if joining indexes
on indexes or indexes on a column or columns, the index will be passed on.
When performing a cross merge, no column specifications to merge on are
allowed.
left: use only keys from left frame, similar to a SQL left outer join;
preserve key order.
right: use only keys from right frame, similar to a SQL right outer join;
preserve key order.
outer: use union of keys from both frames, similar to a SQL full outer
join; sort keys lexicographically.
inner: use intersection of keys from both frames, similar to a SQL inner
join; preserve the order of the left keys.
cross: creates the cartesian product from both frames, preserves the order
of the left keys.
New in version 1.2.0.
on : label or list
Column or index level names to join on. These must be found in both
DataFrames. If on is None and not merging on indexes then this defaults
to the intersection of the columns in both DataFrames.
left_on : label or list, or array-like
Column or index level names to join on in the left DataFrame. Can also
be an array or list of arrays of the length of the left DataFrame.
These arrays are treated as if they are columns.
right_on : label or list, or array-like
Column or index level names to join on in the right DataFrame. Can also
be an array or list of arrays of the length of the right DataFrame.
These arrays are treated as if they are columns.
left_index : bool, default False
Use the index from the left DataFrame as the join key(s). If it is a
MultiIndex, the number of keys in the other DataFrame (either the index
or a number of columns) must match the number of levels.
right_index : bool, default False
Use the index from the right DataFrame as the join key. Same caveats as
left_index.
sort : bool, default False
Sort the join keys lexicographically in the result DataFrame. If False,
the order of the join keys depends on the join type (how keyword).
suffixes : list-like, default is (“_x”, “_y”)
A length-2 sequence where each element is optionally a string
indicating the suffix to add to overlapping column names in
left and right respectively. Pass a value of None instead
of a string to indicate that the column name from left or
right should be left as-is, with no suffix. At least one of the
values must not be None.
copy : bool, default True
If False, avoid copy if possible.
indicator : bool or str, default False
If True, adds a column to the output DataFrame called “_merge” with
information on the source of each row. The column can be given a different
name by providing a string argument. The column will have a Categorical
type with the value of “left_only” for observations whose merge key only
appears in the left DataFrame, “right_only” for observations
whose merge key only appears in the right DataFrame, and “both”
if the observation’s merge key is found in both DataFrames.
validate : str, optional
If specified, checks if merge is of specified type.
“one_to_one” or “1:1”: check if merge keys are unique in both
left and right datasets.
“one_to_many” or “1:m”: check if merge keys are unique in left
dataset.
“many_to_one” or “m:1”: check if merge keys are unique in right
dataset.
“many_to_many” or “m:m”: allowed, but does not result in checks.
Support for specifying index levels as the on, left_on, and
right_on parameters was added in version 0.23.0
Support for merging named Series objects was added in version 0.24.0
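A minimal merge sketch (data assumed for illustration), joining on differently named key columns; overlapping column names receive the default suffixes:
>>> df1 = pd.DataFrame({'lkey': ['foo', 'bar', 'baz', 'foo'],
...                     'value': [1, 2, 3, 5]})
>>> df2 = pd.DataFrame({'rkey': ['foo', 'bar', 'baz', 'foo'],
...                     'value': [5, 6, 7, 8]})
>>> df1.merge(df2, left_on='lkey', right_on='rkey')
  lkey  value_x rkey  value_y
0  foo        1  foo        5
1  foo        1  foo        8
2  foo        5  foo        5
3  foo        5  foo        8
4  bar        2  bar        6
5  baz        3  baz        7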
Number of decimal places to round each column to. If an int is
given, round each column to the same number of places.
Otherwise dict and Series round to variable numbers of places.
Column names should be in the keys if decimals is a
dict-like, or in the index if decimals is a Series. Any
columns not included in decimals will be left as is. Elements
of decimals which are not columns of the input will be
ignored.
method : {‘pearson’, ‘kendall’, ‘spearman’} or callable
Method of correlation:
pearson : standard correlation coefficient
kendall : Kendall Tau correlation coefficient
spearman : Spearman rank correlation
callable: callable with input two 1d ndarrays
and returning a float. Note that the returned matrix from corr
will have 1 along the diagonals and will be symmetric
regardless of the callable’s behavior.
min_periods : int, optional
Minimum number of observations required per pair of columns
to have a valid result. Currently only available for Pearson
and Spearman correlation.
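A small corr sketch (data assumed for illustration) using the default Pearson coefficient:
>>> df = pd.DataFrame([(.2, .3), (.0, .6), (.6, .0), (.2, .1)],
...                   columns=['dogs', 'cats'])
>>> df.corr(method='pearson')
          dogs      cats
dogs  1.000000 -0.851064
cats -0.851064  1.000000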
Compute pairwise covariance of columns, excluding NA/null values.
Compute the pairwise covariance among the series of a DataFrame.
The returned data frame is the covariance matrix of the columns
of the DataFrame.
Both NA and null values are automatically excluded from the
calculation. (See the note below about bias from missing values.)
A threshold can be set for the minimum number of
observations for each value created. Comparisons with observations
below this threshold will be returned as NaN.
This method is generally used for the analysis of time series data to
understand the relationship between different measures
across time.
Returns the covariance matrix of the DataFrame’s time series.
The covariance is normalized by N-ddof.
For DataFrames that have Series that are missing data (assuming that
data is missing at random)
the returned covariance matrix will be an unbiased estimate
of the variance and covariance between the member Series.
However, for many applications this estimate may not be acceptable
because the estimate covariance matrix is not guaranteed to be positive
semi-definite. This could lead to estimate correlations having
absolute values which are greater than one, and/or a non-invertible
covariance matrix. See Estimation of covariance matrices for more details.
>>> np.random.seed(42)>>> df=pd.DataFrame(np.random.randn(1000,5),... columns=['a','b','c','d','e'])>>> df.cov() a b c d ea 0.998438 -0.020161 0.059277 -0.008943 0.014144b -0.020161 1.059352 -0.008543 -0.024738 0.009826c 0.059277 -0.008543 1.010670 -0.001486 -0.000271d -0.008943 -0.024738 -0.001486 0.921297 -0.013692e 0.014144 0.009826 -0.000271 -0.013692 0.977795
Minimum number of periods
This method also supports an optional min_periods keyword
that specifies the required minimum number of non-NA observations for
each column pair in order to have a valid result:
>>> np.random.seed(42)>>> df=pd.DataFrame(np.random.randn(20,3),... columns=['a','b','c'])>>> df.loc[df.index[:5],'a']=np.nan>>> df.loc[df.index[5:10],'b']=np.nan>>> df.cov(min_periods=12) a b ca 0.316741 NaN -0.150812b NaN 1.248003 0.191417c -0.150812 0.191417 0.895202
Pairwise correlation is computed between rows or columns of
DataFrame with rows or columns of Series or DataFrame. DataFrames
are first aligned along both axes before computing the
correlations.
Series.count: Number of non-NA elements in a Series.
DataFrame.value_counts: Count unique combinations of columns.
DataFrame.shape: Number of DataFrame rows and columns (including NA
elements).
DataFrame.isna: Boolean same-sized DataFrame showing places of NA
>>> df=pd.DataFrame({"Person":... ["John","Myla","Lewis","John","Myla"],... "Age":[24.,np.nan,21.,33,26],... "Single":[False,True,True,True,False]})>>> df Person Age Single0 John 24.0 False1 Myla NaN True2 Lewis 21.0 True3 John 33.0 True4 Myla 26.0 False
>>> df=pd.DataFrame([('bird',2,2),... ('mammal',4,np.nan),... ('arthropod',8,0),... ('bird',2,np.nan)],... index=('falcon','horse','spider','ostrich'),... columns=('species','legs','wings'))>>> df species legs wingsfalcon bird 2 2.0horse mammal 4 NaNspider arthropod 8 0.0ostrich bird 2 NaN
By default, missing values are not considered, and the mode of wings
are both 0 and 2. Because the resulting DataFrame has two rows,
the second row of species and legs contains NaN.
>>> df.mode() species legs wings0 bird 2.0 0.01 NaN NaN 2.0
Setting dropna=False, NaN values are considered and they can be
the mode (like for wings).
>>> df.mode(dropna=False) species legs wings0 bird 2 NaN
Setting numeric_only=True, only the mode of numeric columns is
computed, and columns of other types are ignored.
>>> df.mode(numeric_only=True) legs wings0 2.0 0.01 NaN 2.0
To compute the mode over columns and not rows, use the axis parameter:
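For the frame above, a row-wise sketch could look like:
>>> df.mode(axis='columns', numeric_only=True)
           0    1
falcon   2.0  NaN
horse    4.0  NaN
spider   0.0  8.0
ostrich  2.0  NaN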
Returns the original data conformed to a new index with the specified
frequency.
If the index of this DataFrame is a PeriodIndex, the new index
is the result of transforming the original index with
PeriodIndex.asfreq (so the original index
will map one-to-one to the new index).
Otherwise, the new index will be equivalent to pd.date_range(start,end,freq=freq) where start and end are, respectively, the first and
last entries in the original index (see pandas.date_range()). The
values corresponding to any timesteps in the new index which were not present
in the original index will be null (NaN), unless a method for filling
such unknowns is provided (see the method parameter below).
The resample() method is more appropriate if an operation on each group of
timesteps (such as an aggregate) is necessary to represent the data at the new
frequency.
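A minimal asfreq sketch (data assumed for illustration), upsampling a minutely series to 30-second frequency without filling:
>>> index = pd.date_range('1/1/2000', periods=4, freq='T')
>>> df = pd.DataFrame({'s': [0.0, None, 2.0, 3.0]}, index=index)
>>> df.asfreq(freq='30S')
                       s
2000-01-01 00:00:00  0.0
2000-01-01 00:00:30  NaN
2000-01-01 00:01:00  NaN
2000-01-01 00:01:30  NaN
2000-01-01 00:02:00  2.0
2000-01-01 00:02:30  NaN
2000-01-01 00:03:00  3.0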
Convenience method for frequency conversion and resampling of time series.
The object must have a datetime-like index (DatetimeIndex, PeriodIndex,
or TimedeltaIndex), or the caller must pass the label of a datetime-like
series/index to the on/level keyword parameter.
The offset string or object representing target conversion.
axis : {0 or ‘index’, 1 or ‘columns’}, default 0
Which axis to use for up- or down-sampling. For Series this
will default to 0, i.e. along the rows. Must be
DatetimeIndex, TimedeltaIndex or PeriodIndex.
closed : {‘right’, ‘left’}, default None
Which side of bin interval is closed. The default is ‘left’
for all frequency offsets except for ‘M’, ‘A’, ‘Q’, ‘BM’,
‘BA’, ‘BQ’, and ‘W’ which all have a default of ‘right’.
label : {‘right’, ‘left’}, default None
Which bin edge label to label bucket with. The default is ‘left’
for all frequency offsets except for ‘M’, ‘A’, ‘Q’, ‘BM’,
‘BA’, ‘BQ’, and ‘W’ which all have a default of ‘right’.
Pass ‘timestamp’ to convert the resulting index to a
DateTimeIndex or ‘period’ to convert it to a PeriodIndex.
By default the input representation is retained.
loffset : timedelta, default None
Adjust the resampled time labels.
Deprecated since version 1.1.0: You should add the loffset to the df.index after the resample.
See below.
base : int, default 0
For frequencies that evenly subdivide 1 day, the “origin” of the
aggregated intervals. For example, for ‘5min’ frequency, base could
range from 0 through 4. Defaults to 0.
Deprecated since version 1.1.0: The new arguments that you should use are ‘offset’ or ‘origin’.
on : str, optional
For a DataFrame, column to use instead of index for resampling.
Column must be datetime-like.
level : str or int, optional
For a MultiIndex, level (name or number) to use for
resampling. level must be datetime-like.
origin : Timestamp or str, default ‘start_day’
The timestamp on which to adjust the grouping. The timezone of origin
must match the timezone of the index.
If a timestamp is not used, these values are also supported:
‘epoch’: origin is 1970-01-01
‘start’: origin is the first value of the timeseries
‘start_day’: origin is the first day at midnight of the timeseries
New in version 1.1.0.
‘end’: origin is the last value of the timeseries
‘end_day’: origin is the ceiling midnight of the last day
Series.resample : Resample a Series.
DataFrame.resample : Resample a DataFrame.
groupby : Group DataFrame by mapping, function, label, or list of labels.
asfreq : Reindex a DataFrame with the given frequency without grouping.
Downsample the series into 3 minute bins as above, but label each
bin using the right edge instead of the left. Please note that the
value in the bucket used as the label is not included in the bucket,
which it labels. For example, in the original series the
bucket 2000-01-01 00:03:00 contains the value 3, but the summed
value in the resampled bucket with the label 2000-01-01 00:03:00
does not include 3 (if it did, the summed value would be 6, not 3).
To include this value close the right side of the bin interval as
illustrated in the example below this one.
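A sketch of both variants on an assumed minutely series (values 0 through 8): labelling with the right edge, and additionally closing the right side of each bin:
>>> index = pd.date_range('1/1/2000', periods=9, freq='T')
>>> series = pd.Series(range(9), index=index)
>>> series.resample('3T', label='right').sum()
2000-01-01 00:03:00     3
2000-01-01 00:06:00    12
2000-01-01 00:09:00    21
Freq: 3T, dtype: int64
>>> series.resample('3T', label='right', closed='right').sum()
2000-01-01 00:00:00     0
2000-01-01 00:03:00     6
2000-01-01 00:06:00    15
2000-01-01 00:09:00    15
Freq: 3T, dtype: int64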
In contrast with the start_day, you can use end_day to take the ceiling
midnight of the largest Timestamp as the end of the bins and drop the bins
not containing data:
The result will only be true at a location if all the
labels match. If values is a Series, that’s the index. If
values is a dict, the keys must be the column names,
which must match. If values is a DataFrame,
then both the index and column labels must match.
DataFrame.eq: Equality test for DataFrame.
Series.isin: Equivalent method on Series.
Series.str.contains: Test if pattern or regex is contained within a string of a Series or Index.
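A minimal isin sketch (data assumed for illustration), testing each element against a list of values:
>>> df = pd.DataFrame({'num_legs': [2, 4], 'num_wings': [2, 0]},
...                   index=['falcon', 'dog'])
>>> df.isin([0, 2])
        num_legs  num_wings
falcon      True       True
dog        False       True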
Any single or multiple element data structure, or list-like object.
axis : {0 or ‘index’, 1 or ‘columns’}
Whether to compare by the index (0 or ‘index’) or columns
(1 or ‘columns’). For Series input, axis to match Series index on.
level : int or label
Broadcast across a level, matching Index values on the
passed MultiIndex level.
fill_value : float or None, default None
Fill existing missing (NaN) values, and any new element needed for
successful DataFrame alignment, with this value before computation.
If data in both corresponding DataFrame locations is missing
the result will be missing.
axis : {0 or ‘index’, 1 or ‘columns’, None}, default 0
Indicate which axis or axes should be reduced.
0 / ‘index’ : reduce the index, return a Series whose index is the
original column labels.
1 / ‘columns’ : reduce the columns, return a Series whose index is the
original index.
None : reduce all axes, return a scalar.
bool_only : bool, default None
Include only boolean columns. If None, will attempt to use everything,
then use only boolean data. Not implemented for Series.
skipna : bool, default True
Exclude NA/null values. If the entire row/column is NA and skipna is
True, then the result will be True, as for an empty row/column.
If skipna is False, then NA are treated as True, because these are not
equal to zero.
level : int or level name, default None
If the axis is a MultiIndex (hierarchical), count along a
particular level, collapsing into a Series.
axis : {0 or ‘index’, 1 or ‘columns’, None}, default 0
Indicate which axis or axes should be reduced.
0 / ‘index’ : reduce the index, return a Series whose index is the
original column labels.
1 / ‘columns’ : reduce the columns, return a Series whose index is the
original index.
None : reduce all axes, return a scalar.
bool_only : bool, default None
Include only boolean columns. If None, will attempt to use everything,
then use only boolean data. Not implemented for Series.
skipna : bool, default True
Exclude NA/null values. If the entire row/column is NA and skipna is
True, then the result will be False, as for an empty row/column.
If skipna is False, then NA are treated as True, because these are not
equal to zero.
level : int or level name, default None
If the axis is a MultiIndex (hierarchical), count along a
particular level, collapsing into a Series.
numpy.any : Numpy version of this method.
Series.any : Return whether any element is True.
Series.all : Return whether all elements are True.
DataFrame.any : Return whether any element is True over requested axis.
DataFrame.all : Return whether all elements are True over requested axis.
Any single or multiple element data structure, or list-like object.
axis : {0 or ‘index’, 1 or ‘columns’}
Whether to compare by the index (0 or ‘index’) or columns
(1 or ‘columns’). For Series input, axis to match Series index on.
level : int or label
Broadcast across a level, matching Index values on the
passed MultiIndex level.
fill_value : float or None, default None
Fill existing missing (NaN) values, and any new element needed for
successful DataFrame alignment, with this value before computation.
If data in both corresponding DataFrame locations is missing
the result will be missing.
Any single or multiple element data structure, or list-like object.
axis : {0 or ‘index’, 1 or ‘columns’}
Whether to compare by the index (0 or ‘index’) or columns
(1 or ‘columns’). For Series input, axis to match Series index on.
level : int or label
Broadcast across a level, matching Index values on the
passed MultiIndex level.
fill_value : float or None, default None
Fill existing missing (NaN) values, and any new element needed for
successful DataFrame alignment, with this value before computation.
If data in both corresponding DataFrame locations is missing
the result will be missing.
DataFrame.eq : Compare DataFrames for equality elementwise.
DataFrame.ne : Compare DataFrames for inequality elementwise.
DataFrame.le : Compare DataFrames for less than inequality
or equality elementwise.
DataFrame.lt : Compare DataFrames for strictly less than
inequality elementwise.
DataFrame.ge : Compare DataFrames for greater than inequality
or equality elementwise.
DataFrame.gt : Compare DataFrames for strictly greater than
inequality elementwise.
>>> df_multindex = pd.DataFrame({'cost': [250, 150, 100, 150, 300, 220],
...                              'revenue': [100, 250, 300, 200, 175, 225]},
...                             index=[['Q1', 'Q1', 'Q1', 'Q2', 'Q2', 'Q2'],
...                                    ['A', 'B', 'C', 'A', 'B', 'C']])
>>> df_multindex
      cost  revenue
Q1 A   250      100
   B   150      250
   C   100      300
Q2 A   150      200
   B   300      175
   C   220      225
>>> df.le(df_multindex, level=1)
       cost  revenue
Q1 A   True     True
   B   True     True
   C   True     True
Q2 A  False     True
   B   True    False
   C   True    False
Series.sum : Return the sum.
Series.min : Return the minimum.
Series.max : Return the maximum.
Series.idxmin : Return the index of the minimum.
Series.idxmax : Return the index of the maximum.
DataFrame.sum : Return the sum over the requested axis.
DataFrame.min : Return the minimum over the requested axis.
DataFrame.max : Return the maximum over the requested axis.
DataFrame.idxmin : Return the index of the minimum over the requested axis.
DataFrame.idxmax : Return the index of the maximum over the requested axis.
A histogram is a representation of the distribution of data.
This function calls matplotlib.pyplot.hist(), on each series in
the DataFrame, resulting in one histogram per column.
column : str or sequence, optional
If passed, will be used to limit data to a subset of columns.
by : object, optional
If passed, then used to form histograms for separate groups.
grid : bool, default True
Whether to show axis grid lines.
xlabelsize : int, default None
If specified changes the x-axis label size.
xrot : float, default None
Rotation of x axis labels. For example, a value of 90 displays the
x labels rotated 90 degrees clockwise.
ylabelsize : int, default None
If specified changes the y-axis label size.
yrot : float, default None
Rotation of y axis labels. For example, a value of 90 displays the
y labels rotated 90 degrees clockwise.
ax : Matplotlib axes object, default None
The axes to plot the histogram on.
sharex : bool, default True if ax is None else False
In case subplots=True, share x axis and set some x axis labels to
invisible; defaults to True if ax is None otherwise False if an ax
is passed in.
Note that passing in both an ax and sharex=True will alter all x axis
labels for all subplots in a figure.
sharey : bool, default False
In case subplots=True, share y axis and set some y axis labels to
invisible.
figsize : tuple, optional
The size in inches of the figure to create. Uses the value in
matplotlib.rcParams by default.
layout : tuple, optional
Tuple of (rows, columns) for the layout of the histograms.
bins : int or sequence, default 10
Number of histogram bins to be used. If an integer is given, bins + 1
bin edges are calculated and returned. If bins is a sequence, gives
bin edges, including left edge of first bin and right edge of last
bin. In this case, bins is returned unmodified.
backend : str, default None
Backend to use instead of the backend specified in the option
plotting.backend. For instance, ‘matplotlib’. Alternatively, to
specify the plotting.backend for the whole session, set
pd.options.plotting.backend.
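A short usage sketch of the histogram options above (column names and data are invented for the example):
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'length': np.random.randn(100), 'width': np.random.randn(100)})
>>> axes = df.hist(column='length', bins=20, grid=False, xrot=45, figsize=(6, 4))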
Make a box-and-whisker plot from DataFrame columns, optionally grouped
by some other columns. A box plot is a method for graphically depicting
groups of numerical data through their quartiles.
The box extends from the Q1 to Q3 quartile values of the data,
with a line at the median (Q2). The whiskers extend from the edges
of box to show the range of the data. By default, they extend no more than
1.5 * IQR (IQR = Q3 - Q1) from the edges of the box, ending at the farthest
data point within that interval. Outliers are plotted as separate dots.
For further details see
Wikipedia’s entry for boxplot.
column : str or list of str, optional
Column name or list of names, or vector.
Can be any valid input to pandas.DataFrame.groupby().
by : str or array-like, optional
Column in the DataFrame to pandas.DataFrame.groupby().
One box-plot will be done per value of columns in by.
ax : object of class matplotlib.axes.Axes, optional
The matplotlib axes to be used by boxplot.
fontsize : float or str
Tick label font size in points or as a string (e.g., large).
rot : int or float, default 0
The rotation angle of labels (in degrees)
with respect to the screen coordinate system.
grid : bool, default True
Setting this to True will show the grid.
figsize : A tuple (width, height) in inches
The size of the figure to create in matplotlib.
layout : tuple (rows, columns), optional
For example, (3, 5) will display the subplots
using 3 columns and 5 rows, starting from the top-left.
return_type : {‘axes’, ‘dict’, ‘both’} or None, default ‘axes’
The kind of object to return. The default is axes.
‘axes’ returns the matplotlib axes the boxplot is drawn on.
‘dict’ returns a dictionary whose values are the matplotlib
Lines of the boxplot.
‘both’ returns a namedtuple with the axes and dict.
when grouping with by, a Series mapping columns to
return_type is returned.
If return_type is None, a NumPy array
of axes with the same shape as layout is returned.
backend : str, default None
Backend to use instead of the backend specified in the option
plotting.backend. For instance, ‘matplotlib’. Alternatively, to
specify the plotting.backend for the whole session, set
pd.options.plotting.backend.
The return type depends on the return_type parameter:
‘axes’ : object of class matplotlib.axes.Axes
‘dict’ : dict of matplotlib.lines.Line2D objects
‘both’ : a namedtuple with structure (ax, lines)
For data grouped with by, return a Series of the above or a numpy
array:
Series
array (for return_type=None)
Use return_type='dict' when you want to tweak the appearance
of the lines after plotting. In this case a dict containing the Lines
making up the boxes, caps, fliers, medians, and whiskers is returned.
Boxplots can be created for every column in the dataframe
by df.boxplot() or indicating the columns to be used:
Boxplots of variables distributions grouped by the values of a third
variable can be created using the option by. For instance:
A list of strings (i.e. ['X','Y']) can be passed to boxplot
in order to group the data by combination of the variables in the x-axis:
The layout of boxplot can be adjusted giving a tuple to layout:
Additional formatting can be done to the boxplot, like suppressing the grid
(grid=False), rotating the labels in the x-axis (i.e. rot=45)
or changing the fontsize (i.e. fontsize=15):
The parameter return_type can be used to select the type of element
returned by boxplot. When return_type='axes' is selected,
the matplotlib axes on which the boxplot is drawn are returned:
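A hedged sketch consolidating the usages described above (data and column names are invented; return values follow the return_type rules listed earlier):
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randn(10, 2), columns=['Col1', 'Col2'])
>>> df['X'] = ['A'] * 5 + ['B'] * 5
>>> axes = df.boxplot(column=['Col1', 'Col2'], by='X', rot=45, fontsize=12, layout=(1, 2))
>>> lines = df.boxplot(column='Col1', return_type='dict')  # dict of matplotlib Lines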
DataFrame.to_numpy : Recommended alternative to this method.
DataFrame.index : Retrieve the index labels.
DataFrame.columns : Retrieving the column names.
The dtype will be a lower-common-denominator dtype (implicit
upcasting); that is to say if the dtypes (even of numeric types)
are mixed, the one that accommodates all will be chosen. Use this
with care if you are not dealing with the blocks.
e.g. If the dtypes are float16 and float32, dtype will be upcast to
float32. If dtypes are int32 and uint8, dtype will be upcast to
int32. By numpy.find_common_type() convention, mixing int64
and uint64 will result in a float64 dtype.
A DataFrame with mixed type columns (e.g., str/object, int64, float32)
results in an ndarray of the broadest type that accommodates these
mixed types (e.g., object).
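For example, mixing integer and float columns upcasts the returned array to float64 (a minimal illustration of the rule above):
>>> import pandas as pd
>>> df = pd.DataFrame({'age': [3, 29], 'height': [94.5, 170.1]})
>>> df.values.dtype
dtype('float64')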
Assigns values outside boundary to boundary values. Thresholds
can be singular values or array like, and in the latter case
the clipping is performed element-wise in the specified axis.
Series.clip : Trim values at input threshold in series.
DataFrame.clip : Trim values at input threshold in dataframe.
numpy.clip : Clip (limit) the values in an array.
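For example (standard pandas API; the data are made up):
>>> import pandas as pd
>>> df = pd.DataFrame({'a': [-2, 5, 9], 'b': [1, 3, 12]})
>>> df.clip(lower=0, upper=8)
   a  b
0  0  1
1  5  3
2  8  8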
‘linear’: Ignore the index and treat the values as equally
spaced. This is the only method supported on MultiIndexes.
‘time’: Works on daily and higher resolution data to interpolate
given length of interval.
‘index’, ‘values’: use the actual numerical values of the index.
‘pad’: Fill in NaNs using existing values.
‘nearest’, ‘zero’, ‘slinear’, ‘quadratic’, ‘cubic’, ‘spline’,
‘barycentric’, ‘polynomial’: Passed to
scipy.interpolate.interp1d. These methods use the numerical
values of the index. Both ‘polynomial’ and ‘spline’ require that
you also specify an order (int), e.g.
df.interpolate(method='polynomial',order=5).
‘krogh’, ‘piecewise_polynomial’, ‘spline’, ‘pchip’, ‘akima’,
‘cubicspline’: Wrappers around the SciPy interpolation methods of
similar names. See Notes.
‘from_derivatives’: Refers to
scipy.interpolate.BPoly.from_derivatives which
replaces ‘piecewise_polynomial’ interpolation method in
scipy 0.18.
axis : {0 or ‘index’, 1 or ‘columns’, None}, default None
Axis to interpolate along.
limit : int, optional
Maximum number of consecutive NaNs to fill. Must be greater than
0.
limit_direction : {‘forward’, ‘backward’, ‘both’}, optional
Consecutive NaNs will be filled in this direction.
If limit is specified:
If ‘method’ is ‘pad’ or ‘ffill’, ‘limit_direction’ must be ‘forward’.
If ‘method’ is ‘backfill’ or ‘bfill’, ‘limit_direction’ must be
‘backwards’.
If ‘limit’ is not specified:
If ‘method’ is ‘backfill’ or ‘bfill’, the default is ‘backward’
else the default is ‘forward’
Changed in version 1.1.0: raises ValueError if limit_direction is ‘forward’ or ‘both’ and
method is ‘backfill’ or ‘bfill’.
raises ValueError if limit_direction is ‘backward’ or ‘both’ and
method is ‘pad’ or ‘ffill’.
The ‘krogh’, ‘piecewise_polynomial’, ‘spline’, ‘pchip’ and ‘akima’
methods are wrappers around the respective SciPy implementations of
similar names. These use the actual numerical values of the index.
For more information on their behavior, see the
SciPy documentation
and SciPy tutorial.
Filling in NaN in a Series via polynomial interpolation or splines:
Both ‘polynomial’ and ‘spline’ methods require that you also specify
an order (int).
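For example (standard pandas API; requires SciPy for the polynomial method):
>>> import numpy as np
>>> import pandas as pd
>>> s = pd.Series([0, 2, np.nan, 8])
>>> s.interpolate(method='polynomial', order=2)
0    0.000000
1    2.000000
2    4.666667
3    8.000000
dtype: float64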
Fill the DataFrame forward (that is, going down) along each column
using linear interpolation.
Note how the last entry in column ‘a’ is interpolated differently,
because there is no entry after it to use for interpolation.
Note how the first entry in column ‘b’ remains NaN, because there
is no entry before it to use for interpolation.
>>> df = pd.DataFrame([(0.0, np.nan, -1.0, 1.0),
...                    (np.nan, 2.0, np.nan, np.nan),
...                    (2.0, 3.0, np.nan, 9.0),
...                    (np.nan, 4.0, -4.0, 16.0)],
...                   columns=list('abcd'))
>>> df
     a    b    c     d
0  0.0  NaN -1.0   1.0
1  NaN  2.0  NaN   NaN
2  2.0  3.0  NaN   9.0
3  NaN  4.0 -4.0  16.0
>>> df.interpolate(method='linear', limit_direction='forward', axis=0)
     a    b    c     d
0  0.0  NaN -1.0   1.0
1  1.0  2.0 -2.0   5.0
2  2.0  3.0 -3.0   9.0
3  2.0  4.0 -4.0  16.0
cond : bool Series/DataFrame, array-like, or callable
Where cond is True, keep the original value. Where
False, replace with corresponding value from other.
If cond is callable, it is computed on the Series/DataFrame and
should return boolean Series/DataFrame or array. The callable must
not change input Series/DataFrame (though pandas doesn’t check it).
other : scalar, Series/DataFrame, or callable
Entries where cond is False are replaced with
corresponding value from other.
If other is callable, it is computed on the Series/DataFrame and
should return scalar or Series/DataFrame. The callable must not
change input Series/DataFrame (though pandas doesn’t check it).
inplace : bool, default False
Whether to perform the operation in place on the data.
axis : int, default None
Alignment axis if needed.
level : int, default None
Alignment level if needed.
errors : str, {‘raise’, ‘ignore’}, default ‘raise’
Note that currently this parameter won’t affect
the results and will always coerce to a suitable dtype.
‘raise’ : allow exceptions to be raised.
‘ignore’ : suppress exceptions. On error return original object.
try_cast : bool, default None
Try to cast the result back to the input type (if possible).
Deprecated since version 1.3.0: Manually cast back if necessary.
The where method is an application of the if-then idiom. For each
element in the calling DataFrame, if cond is True the
element is used; otherwise the corresponding element from the DataFrame
other is used.
The signature for DataFrame.where() differs from
numpy.where(). Roughly df1.where(m,df2) is equivalent to
np.where(m,df1,df2).
For further details and examples see the where documentation in
indexing.
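For example (standard pandas API; the Series is made up):
>>> import pandas as pd
>>> s = pd.Series(range(5))
>>> s.where(s > 1, 10)
0    10
1    10
2     2
3     3
4     4
dtype: int64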
cond : bool Series/DataFrame, array-like, or callable
Where cond is False, keep the original value. Where
True, replace with corresponding value from other.
If cond is callable, it is computed on the Series/DataFrame and
should return boolean Series/DataFrame or array. The callable must
not change input Series/DataFrame (though pandas doesn’t check it).
other : scalar, Series/DataFrame, or callable
Entries where cond is True are replaced with
corresponding value from other.
If other is callable, it is computed on the Series/DataFrame and
should return scalar or Series/DataFrame. The callable must not
change input Series/DataFrame (though pandas doesn’t check it).
inplace : bool, default False
Whether to perform the operation in place on the data.
axis : int, default None
Alignment axis if needed.
level : int, default None
Alignment level if needed.
errors : str, {‘raise’, ‘ignore’}, default ‘raise’
Note that currently this parameter won’t affect
the results and will always coerce to a suitable dtype.
‘raise’ : allow exceptions to be raised.
‘ignore’ : suppress exceptions. On error return original object.
try_cast : bool, default None
Try to cast the result back to the input type (if possible).
Deprecated since version 1.3.0: Manually cast back if necessary.
The mask method is an application of the if-then idiom. For each
element in the calling DataFrame, if cond is False the
element is used; otherwise the corresponding element from the DataFrame
other is used.
The signature for DataFrame.where() differs from
numpy.where(). Roughly df1.where(m,df2) is equivalent to
np.where(m,df1,df2).
For further details and examples see the mask documentation in
indexing.
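For example, mask is the boolean inverse of where (standard pandas API; the Series is made up):
>>> import pandas as pd
>>> s = pd.Series(range(5))
>>> s.mask(s > 1, 10)
0     0
1     1
2    10
3    10
4    10
dtype: int64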
This method adds the directory (possibly a specific subfolder) to the sys.path (i.e. PYTHONPATH)
so that the Python functions contained in this folder can be called from within OMFIT.
Attempts to look up a list of available EFITs using various sources
Parameters:
scratch_area – dict
Scratch area for storing results to reduce repeat calls.
device – str
Device name
shot – int
Shot number
allow_rdb – bool
Allow connection to DIII-D RDB to gather EFIT information (only applicable for select devices)
(First choice for supported devices)
allow_mds – bool
Allow connection to MDSplus to gather EFIT information (only applicable to select devices)
(First choice for non-RDB devices, second choice for devices that normally support RDB)
allow_guess – bool
Allow guesses based on common patterns of EFIT availability on specific devices
(Last resort, only if other options fail)
**kw –
Keywords passed to specific functions. Can include:
param default_snap_list:
dict [optional]
Default set of EFIT treenames. Newly discovered ones will be added to the list.
param format:
str
Instructions for formatting data to make the EFIT tag name.
Provided for compatibility with available_efits_from_rdb() because the only option is ‘{tree}’.
Returns:
(dict, str)
Dictionary keys will be descriptions of the EFITs
Dictionary values will be the formatted identifiers.
If lookup fails, the dictionary will be {‘’: ‘’} or will only contain default results, if any.
String will contain information about the discovered EFITs
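A hedged sketch of consuming the return value; the handle available_efits below is a stand-in name (the real callable is not shown in this excerpt), and the device, shot, and argument order are assumptions:
>>> efits, info = available_efits({}, 'DIII-D', 123456)  # stand-in name; see parameter list above
>>> for description, tag in efits.items():
...     print(description, '->', tag)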
Copy FC, OH, VESSEL from passed object into current object,
without changing the number of elements in current object.
This requires that the number of elements in the current object
is greater than or equal to the number of elements in the passed object.
The extra elements in the current object will be placed at R=0, Z=0
Copy flux loops and magnetic probes from other object into current object,
without changing the number of elements in current object
This requires that the number of elements in the current object
is greater than or equal to the number of elements in the passed object.
The extra elements in the current object will be placed at R=0, Z=0
The routine converts ods outline format to efund data format
Since efund only supports parallelograms and requires 2 sides
to be either vertical or horizontal this will likely not match
the outline very well. Instead, the parallelogram will only
match the angle of the lower left side, the height of the upper
right side, and width of the left-most top side.
WARNING: only rudimentary identifiers are assigned for pf_active
You should assign your own identifiers and only rely on this function to assign numerical geometry data.
Parameters:
ods – ODS instance
Data will be added in-place
update – systems to populate
[‘oh’, ‘pf_active’, ‘flux_loop’, ‘b_field_pol_probe’]
[‘magnetics’] will enable both [‘flux_loop’, ‘b_field_pol_probe’]
NOTE that in IMAS the OH information goes under pf_active
filename – ‘directory/bla/OMFITsave.txt’ or ‘directory/bla.zip’ where the OMFITtree will be saved
(if ‘’ it will be saved in the same folder of the parent OMFITtree)
only – list of strings used to load only some of the branches from the tree (eg. [“[‘MainSettings’]”,”[‘myModule’][‘SCRIPTS’]”]
modifyOriginal – by default OMFIT will save a copy and then overwrite previous save only if successful.
If modifyOriginal=True and filename is not .zip, will write data directly at destination,
which will be faster but comes with the risk of deleting a good save if the new save
fails for some reason
readOnly – will place entry in OMFITsave.txt of the parent so that this OMFITtree can be loaded,
but will not save the actual content of this subtree. readOnly=True is meant to be
used only after this subtree is deployed where its filename says it will be. Using this
feature could result in much faster project saves if the content of this tree is large.
quiet – Verbosity level
developerMode – load OMFITpython objects within the tree as modifyOriginal
serverPicker – take server/tunnel info from MainSettings[‘SERVER’]
remote – access the filename in the remote directory
server – if specified the file will be downsync from the server
tunnel – access the filename via the tunnel
**kw – Extra keywords are passed to the SortedDict class
filename – ‘directory/bla/OMFITsave.txt’ or ‘directory/bla.zip’ where the OMFITtree will be saved
(if ‘’ it will be saved in the same folder of the parent OMFITtree)
only – list of strings used to load only some of the branches from the tree (eg. [“[‘MainSettings’]”,”[‘myModule’][‘SCRIPTS’]”]
modifyOriginal – by default OMFIT will save a copy and then overwrite previous save only if successful.
If modifyOriginal=True and filename is not .zip, will write data directly at destination,
which will be faster but comes with the risk of deleting a good save if the new save
fails for some reason
readOnly – will place entry in OMFITsave.txt of the parent so that this OMFITtree can be loaded,
but will not save the actual content of this subtree. readOnly=True is meant to be
used only after this subtree is deployed where its filename says it will be. Using this
feature could result in much faster project saves if the content of this tree is large.
quiet – Verbosity level
developerMode – load OMFITpython objects within the tree as modifyOriginal
lazyLoad – enable/disable lazy load of picklefiles and xarrays
OMFIT class used to interface to equilibria files generated by ELITE and BALOO (.dskbal files)
NOTE: this object is “READ ONLY”, meaning that the changes to the entries of this object will not be saved to a file. Method .save() could be written if it becomes necessary
Parameters:
filename – filename passed to OMFITobject class
**kw – keyword dictionary passed to OMFITobject class
OMFIT class used to interface to balstab file from BALOO
NOTE: this object is “READ ONLY”, meaning that the changes to the entries of this object will not be saved to a file. Method .save() could be written if it becomes necessary
Parameters:
filename – filename passed to OMFITobject class
**kw – keyword dictionary passed to OMFITobject class
OMFIT class used to interface to outbal file from BALOO
NOTE: this object is “READ ONLY”, meaning that the changes to the entries of this object will not be saved to a file. Method .save() could be written if it becomes necessary
Parameters:
filename – filename passed to OMFITobject class
**kw – keyword dictionary passed to OMFITobject class
Contains classes and functions to perform ELM detection and ELM filtering:
- OMFITelm class; main workhorse for ELM detection
- Some smoothing functions used by OMFITelm
The regression test for OMFITelm is in regression/test_OMFITelm.py
omfit_classes.omfit_elm.asym_gauss_smooth(x, y, s, lag, leading_side_width_factor)[source]¶
This is a smoothing function with a Gaussian kernel that does not require evenly spaced data and allows the
Gaussian center to be shifted.
Parameters:
x – array
Independent variable
y – array
Dependent variable
s – float
Sigma of tailing side (same units as x)
lag – float
Positive values shift the Gaussian kernel back in time to increase weight in the past:
makes the result laggier. (same units as x)
leading_side_width_factor – float
The leading side sigma will be this much bigger than the tailing side. Values > 1 increase the weight on
data from the past, making the signal laggier. (unitless)
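A minimal sketch of the idea (not the actual OMFIT implementation): each output point is a weighted average of y, with weights from a Gaussian kernel whose center is shifted back in time by lag and whose sigma is larger on the leading (past) side.
import numpy as np

def asym_gauss_smooth_sketch(x, y, s, lag, leading_side_width_factor):
    # Illustrative only: asymmetric Gaussian smoothing on (possibly unevenly spaced) data
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    out = np.empty_like(y)
    for i, x0 in enumerate(x):
        center = x0 - lag  # lag > 0 shifts the kernel back in time (more weight on the past)
        sigma = np.where(x < center, s * leading_side_width_factor, s)  # wider sigma on the past side
        w = np.exp(-0.5 * ((x - center) / sigma) ** 2)
        out[i] = np.sum(w * y) / np.sum(w)
    return out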
Quickly detect ELMs and run a filter that will tell you which time-slices are okay and which should be rejected
based on user specifications for ELM phase, etc.
Parameters:
device – string
Name of tokamak or MDS server
shot – int
Shot number
detection_settings –
dict or string.
If this is ‘default’ or {}, then default settings will be used. To change some of the settings from their
machine-specific defaults, set to a dictionary that can contain:
default_filterscope_for_elms: string or list of strings giving filterscope(s) to use. Will be overridden
by use_filterscopes keyword, if provided.
smoother: string giving name of smoothing function. Pick from:
gauss_smooth: Gaussian smoother
rc_smooth: RC lowpass filter, specifically implemented to mimic approximations used in the DIII-D PCS.
asym_gauss_smooth: Asymmetric Gaussian smoother with different sigma for past and future.
fft_smooth: lowpass via FFT, cut out part of spectrum, inverse FFT
time_window: two element list of numbers giving ends of the time range to consider in ms
time_factor: float: factor to multiply into MDSplus times to get timebase in ms
jump_hold_max_dt: float Sets maximum dt between input data samples for the Jump&Hold method.
Used to exclude slow data from old shots, which don’t work.
allow_fallback_when_dt_large: bool. If a basic test of data compatibility with the chosen method
fails, a new method may be chosen if this flag is set.
hold_elm_flag_until_low_dalpha: flag for turning on extra logic to hold the during-ELM state until
D_alpha drops to close to pre-ELM levels.
hold_elm_low_dalpha_threshold: float: Threshold as a fraction between pre-ELM min & during-ELM max
hold_elm_min_finding_time_window: float: Interval before ELM start to search for pre-ELM minimum (ms)
hold_elm_timeout: float: Maximum time the ELM flag hold can persist (ms). The ELM can still be longer
than this as determined by other rules, but the hold flag cannot extend the ELM past this total duration.
detection_method: int: 0 = classic edge detection style. 1 = new jump and hold strategy.
find_jump_time_window: float: Length of time window used to find nearby maximum in method 1 (ms)
find_jump_threshold: float: Threshold for normalized jump size in method 1 (unitless)
****_tuning: dict where **** is the name of a smoothing function. Within this dict can be:
mild_smooth: (ms) Smoothing timescale for mild smooth
heavy_smooth_factor: Heavy will be this much stronger than mild
mild_smooth_derivative_factor: The deriv is of already smoothed data, but it may need a bit more.
heavy_smooth_derivative_factor: The derivative is smoothed again harder so there can be a diff.
threshold_on_difference_of_smooths: When mild-heavy is greater than this threshold, it must be
during ELM (relative to max diff)
threshold_on_difference_of_smoothed_derivatives_plus: When mild(der)-heavy(der) is greater than
this threshold, it must be on one of the edges of an ELM.
threshold_on_difference_of_smoothed_derivatives_minus: Same as previous, but heavy(der)-mild(der)
instead.
d_thresh_enhance_factor: When the diff of derivs is positive but the less smoothed derivative is
still negative, we’re in danger of getting a false positive, so make the threshold higher.
neg_d_thresh_enhance_factor: Same as before but for when mild(der)-heavy(der) is negative instead.
debounce: Number of samples in debouncing window
leading_side_width_factor: [asym_gauss_smooth ONLY]: how big is the asymmetry?
gauss_center_lag: [asym_gauss_smooth ONLY]: Shift center back in time, making the thing laggier.
Negative value shifts forward in time, putting more weight on the future.
filter_settings –
dict or string
If this is ‘default’ or {}, then default settings will be used. To change some of the settings from their
machine-specific defaults, set to a dictionary that can contain:
elm_phase_range: A two element list or array containing the minimum and maximum acceptable values of
ELM phase. ELM phase starts at 0 when an ELM ends and goes to 1 just before the next ELM begins,
then flips to -1 and is negative during ELMs. ELM phase increases from -1 to 0 during an ELM before
becoming positive again in the next inter-ELM period.
elm_since_reject_below: Data are rejected if the time since the last ELM is below this threshold.
Useful for packs of closely spaced ELMs; a double spike in D-alpha could register as two ELMs with a
very short inter-ELM period between them when phase would increase from 0 to 1 and a slice could be
accepted, even if there hadn’t really been any time for ELM recovery.
This setting is ignored if its value is <= -10 or if either end of the elm_phase_range is < 0.
elm_since_accept_above: Data are accepted if the time since the last ELM is above this threshold,
regardless of ELM phase. Useful for analyzing shots that have a mixture of ELMing and non-ELMing
phases. An ELM free period will normally be counted as one long inter-ELM period and it will take a
long time for ELM phase to increase, which could lead to rejection of too many data. This setting
overrides the ELM phase test to allow ELM-free periods to be included.
CER_entire_window_must_pass_ELM_filter: Relevant for CER where stime>0. If this is True, then the
entire averaging window must be in the “good” part of the ELM phase. If this is False, only the
middle has to be in the good part and also no ELMs are allowed to start during the time window in
either case.
use_filterscopes – None, False, or input satisfying detection_settings -> default_filterscope_for_elms
Easier-to-access override for default_filterscope_for_elms in detection_settings
attempt_sum_tdi_filterscopoes – bool
Try to ask the server to sum D_alpha signals so that they don’t have to be interpolated and summed client
side, and so that there will only be one signal transferred. Works sometimes, but not yet entirely reliable.
It also doesn’t have an obvious speed benefit.
debug – bool
Turn on spammier debug print or printd statements and also save intermediate data
debug_plots – bool
Turn on debugging plots
on_failure – string
What action should be done for failures?
‘exception’: raise an exception
‘pass’: return all ones from filter(); passing all data through the ELM filter (same as no ELM filtering)
mode – string
‘elm’: ELM detection (what this class was always meant to do)
‘sawtooth’: Sawtooth detection
quiet – bool
Suppress some warnings and error messages which would often be useful.
Tries to guess the time factor needed to convert to ms
Parameters:
mdsvalue – OMFITmdsValue instance [optional]
Used to obtain mds_time_units, if provided
mds_time_units –
string [optional]
This string will be compared to common ways of representing various
time units. If this is not provided and cannot be obtained, the guess
will be based on device only.
This setting is ignored if:
mdsvalue is provided
The device is one which is known to frequently have incorrect
units logged in MDSplus.
device – string [optional]
Defaults to self.device. This shouldn’t have to be supplied
except for testing or exotic applications.
Returns:
float
Factor needed to convert from time units to ms.
Returns None if it cannot be determined.
Automatically chooses signals to use for sawtooth detection.
First: try to get within rho < rho_close_to_axis.
Second: avoid cut-off.
If there are no good channels (no cut-off) within rho <= rho_close_to_axis,
get the closest channel that’s not cut-off, but don’t look at rho >= rho_far_from_axis for it.
For cut-off, use average density to estimate local density, based on the assumption that anything farther
out than the top of the pedestal can be ignored anyway, and that density has a small gradient in the core. This
might be okay.
There is no relativistic correction for frequency, which could lead to a small error in position which is
deemed to be irrelevant for this application.
Parameters:
rho_close_to_axis – float between 0 and 1
The first try is to find non-cut-off channels with rho <= rho_close_to_axis
rho_far_from_axis – float between 0 and 1
If no channels with rho <= rho_close_to_axis, find the closest non-cut-off channel, but only use it if
its rho < rho_far_from_axis
Returns:
list of strings
Pointnames for good ECE channels to use in sawtooth detection
New detection algorithm. Focuses on finding just the start of each ELM (when D_alpha jumps up) at first,
then forcing the ELM state to be held until D_alpha drops back down again.
Uses a Difference of Smooths (generalized from difference of gaussians edge detection) scheme to find the edges
of D_alpha bursts. It also runs a DoS scheme on the derivative of D_alpha w.r.t. time.
This allows time points (in the D_alpha series) to be flagged as soon as the ELM starts: it doesn’t just get
the tip, but instead it takes the whole thing.
Parameters:
calc_elm_freq – bool
Run ELM frequency calculation
report_profiling – bool
Reports time taken to reach / time taken between checkpoints in the code.
Used to identify bottlenecks and support optimization.
Finish detection by compressing the ELM flag to 4 points per ELM and calculating quantities like ELM phase
Separate from .detect() to make room for different detection algorithms which will all finish the same way.
Could also be used for convenience with an externally calculated ELM flag by passing in your own ELM detection
results (and using no_save=True to avoid disturbing results calculated by the class, if you like).
Parameters:
x – float array
Time base in ms
elm_flag2 – int array matching length of x
Flag indicating when ELMs are happening (1) or not (0)
no_save – bool
Disable saving of results; return them in a dictionary instead
shot – [optional]
Only used in announcements. If None, self.shot will be used. If you are passing in some other stuff and
don’t want inconsistent printd statements, you can pass in a different thing for shot.
It would traditionally be an int, but it could be a string or something since it only gets printed here.
Returns:
None or dict
Returns results in a dict if no_save is set. Otherwise, results are just saved into the class and nothing is
returned.
method – None or int
None: use value in settings
int: manual override: ID number of method you want to use
- 0: very simple: 1 / local period
- 1: simple w/ smooth
- 2: process period then invert
- 3: method 0 then interpolate and smooth
Use ELM detector & ELM filtering settings to determine whether each element in an array of times is good or bad.
Parameters:
times_to_filter – numeric 1d array or a list of such arrays
For most signals (like Thomson scattering data): a single 1D array of times in ms. For CER: a list of
1D arrays, one for each CER channel.
cer_mode – bool
Assume times_to_filter is a list of lists/arrays, one entry per channel (loop through elements of
times, each one of which had better itself have several elements)
stimes – float -OR- array or list of arrays matching dimensions of times_to_filter
Averaging time window for CER. This is typically set to 2.5 ms, but it can vary (read it from your data if
possible)
apply_elm_filter – bool
Debugging option: if this is set False, the script will go through its pre-checks and setup without actually
doing any ELM-filtering.
debug – bool [optional]
Override class debug flag for just the filtering run. Leave at None to use self.debug instead of overriding.
on_failure – str [optional]
Override class on_failure flag. Sets behavior when filtering is not possible.
‘pass’: Pass all data (elm_okay = 1 for all)
‘exception’: raise an exception. (default for unrecognized)
Returns:
array of bools or list of arrays matching shape of times_to_filter
Flag indicating whether each sample in times_to_filter is okay according to ELM filtering rules
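A hedged usage sketch (within an OMFIT session, or importing from omfit_classes.omfit_elm; the shot number and time base are invented, and keyword defaults are assumed for everything not shown):
>>> import numpy as np
>>> from omfit_classes.omfit_elm import OMFITelm
>>> elm = OMFITelm(device='DIII-D', shot=123456)  # ELM detection/filtering per the settings documented above
>>> times = np.linspace(2000.0, 3000.0, 501)  # ms
>>> ok = elm.filter(times)  # array of bools matching the shape of times
>>> good_times = times[ok]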
time_range – two element iterable with numbers
The plot will initially zoom to this time range (ms).
time_range_for_mean – two element numeric, True, or False/None
- True: time_range will be used to define the interval, then the mean ELM frequency will be calculated
- False or None: no mean ELM frequency calculation
- Two element numeric: mean ELM frequency will be calculated in this interval; can differ from time_range.
overlay – bool
Overlay on existing figure instead of making a new one
fig – matplotlib Figure instance
Figure to use when overlaying.
axs – List of at least two matplotlib Axes instances
Axes to use when overlaying
quiet – bool
Convert all print to printd
plotkw – Additional keywords are passed to pyplot.plot()
Plots the signal used for detection and shades or recolors intervals where events are detected
:param ax: Axes instance
:param wt: array for masking
:param shade_elms: bool
:param standalone: bool
Plots details related to ELM/sawtooth/whatever timing, like time since the last event
:param ax: Axes instance
:param crop_elm_since: bool
:param standalone: bool
Plot explanation of the hold correction to the ELM flag
Only works if detection was done with debug = True
:param ax: Axes instance
:param wt: array of bools
Plots important variables related to jump & hold ELM detection
This both demonstrates how the ELM detector works and serves as a diagnostic plot that can help with tuning.
Parameters:
time_zoom – two element iterable containing numbers
Zoom in to this time range in ms
If None, auto selects the default for the current device
crop_data_to_zoom_range – bool
Crop data to range given by time_zoom before calling plot. This can prevent resampling, which will make the
plot better.
crop_elm_since – float or bool
float = plot max in ms. True = auto-scale sensibly. False = auto-scale stupidly.
show_phase – bool
Plot ELM phase in an additional subplot. Phase is 0 at the end of an ELM, then increases during the
inter-ELM period until it reaches +1 at the start of the next ELM.
Then it jumps to -1 and increases back to 0 during the ELM.
show_more_timing – bool
Include an additional subplot showing more timing details like time since last ELM & local ELM period length
show_details – bool
Shows details of how ELM detection works (individual terms like DS, DD, etc.) in a set of add’l subplots.
hide_y – bool
Turns off numbers on the y-axis and in some legend entries. Useful if you want to consider D_alpha to be in
arbitrary units and don’t want to be distracted by numbers.
hide_numbers – bool
Hides numerical values of smoothing time scales and other settings used in ELM detection.
Useful if you want shorter legend entries or a simpler looking plot.
legend_outside – bool
Place the legends outside of the plots; useful if you have the plots positioned so there is empty space to
the right of them.
notitles – bool
Suppress titles on the subplots.
shade_elms – bool
On the ELM ID plot, shade between 0 and D_alpha
figsize – Two element iterable containing numbers
(X, Y) Figure size in cm
fig – Figure instance
Used for overlay
axs – List of Axes instances
Used for overlay. Must be at least long enough to accommodate the number of plots ordered.
Plots important variables related to classic ELM detection
This both demonstrates how the ELM detector works and serves as a diagnostic plot that can help with tuning.
Parameters:
time_zoom – two element iterable containing numbers
Zoom in to this time range in ms
If None, auto selects the default for the current device
crop_data_to_zoom_range – bool
Crop data to range given by time_zoom before calling plot. This can prevent resampling, which will make the
plot better.
crop_elm_since – float or bool
float = plot max in ms. True = auto-scale sensibly. False = auto-scale stupidly.
show_phase – bool
Plot ELM phase in an additional subplot. Phase is 0 at the end of an ELM, then increases during the
inter-ELM period until it reaches +1 at the start of the next ELM.
Then it jumps to -1 and increases back to 0 during the ELM.
show_more_timing – bool
Include an additional subplot showing more timing details like time since last ELM & local ELM period length
show_details – bool
Shows details of how ELM detection works (individual terms like DS, DD, etc.) in a set of add’l subplots.
hide_y – bool
Turns off numbers on the y-axis and in some legend entries. Useful if you want to consider D_alpha to be in
arbitrary units and don’t want to be distracted by numbers.
hide_numbers – bool
Hides numerical values of smoothing time scales and other settings used in ELM detection.
Useful if you want shorter legend entries or a simpler looking plot.
legend_outside – bool
Place the legends outside of the plots; useful if you have the plots positioned so there is empty space to
the right of them.
notitles – bool
Suppress titles on the subplots.
shade_elms – bool
On the ELM ID plot, shade between 0 and D_alpha
figsize – Two element iterable containing numbers
(X, Y) Figure size in cm
fig – Figure instance
Used for overlay
axs – List of Axes instances
Used for overlay. Must be at least long enough to accommodate the number of plots ordered.
Read basic equilibrium data from MDSplus
This is a lightweight function for reading simple data from all EFIT slices at once without making g-files.
Parameters:
device – str
The tokamak that the data correspond to (‘DIII-D’, ‘NSTX’, etc.)
server – str [Optional, special purpose]
MDSplus server to draw data from. Use this if you are connecting to a
server that is not recognized by the tokamak() command, like vidar,
EAST_US, etc. If this is None, it will be copied from device.
shot – int
Shot number from which to read data
tree – str
Name of the MDSplus tree to connect to, like ‘EFIT01’, ‘EFIT02’, ‘EFIT03’, …
g_file_quantities – list of strings
Quantities to read from the sub-tree corresponding with the EFIT g-file.
Example: [‘r’, ‘z’, ‘rhovn’]
a_file_quantities – list of strings
Quantities to read from the sub-tree corresponding with the EFIT a-file.
Example: [‘area’]
measurements – list of strings
Quantities to read from the MEASUREMENTS tree.
Example: [‘fccurt’]
derived_quantities – list of strings
Derived quantities to be calculated and returned.
This script understands a limited set of simple calculations: ‘time’, ‘psin’, ‘psin1d’
Example: [‘psin’, ‘psin1d’, ‘time’]
other_results – list of strings
Other quantities to be gathered from the parent tree that holds gEQDSK and aEQDSK.
Example: [‘DATE_RUN’]
quiet – bool
get_all_meas – bool
Fetch measurement signals according to its time basis which includes extra time slices that failed to fit.
The time ‘axis’ will be available in [‘mtimes’]
toksearch_mds – OMFITtoksearch instance
An already fetched and loaded OMFITtoksearch object, expected to have
fetched all of the signals for the mdsValues in this file.
allow_shot_tree_translation – bool
Allow the real shot and tree to be translated to the fake shot stored in the EFIT tree
device – string
Name of the tokamak or MDSserver from whence cometh the data.
shot – int
Shot for which data are to be gathered.
times – numeric iterable
Time slices to gather in ms, even if working with an MDS server that normally operates in seconds.
exact – bool
Fail instead of interpolating if the exact time-slices are not available.
snap_file – string
Description of which EFIT tree to gather from.
time_diff_warning_threshold – float
Issue a warning if the difference between a requested time slice and the closest time slice in the source EFIT
exceeds this threshold.
fail_if_out_of_range – bool
Skip requested times that fail the above threshold test.
get_afile – bool
gather A-file quantities as well as G-file quantities.
get_mfile – bool
gather M-file quantities as well as G-file quantities.
show_missing_data_warnings –
bool
1 or True: Print a warning for each missing item when setting it to a default value.
May not be necessary because some things in the a-file don’t seem very important
and are always missing from MDSplus.
2 or “once”: print warning messages for missing items if the message would be unique.
Don’t repeat warnings about the same missing quantity for subsequent time-slices.
0 or False: printd instead (use quiet if you really don’t want it to print anything)
None: select based on device. Most devices should default to ‘once’.
debug – bool
Save intermediate results to the tree for inspection.
quiet – bool
close – bool
Close each file at each time before going on to the next time
Returns:
a dictionary containing a set of G-files in another dictionary named ‘gEQDSK’, and, optionally, a set of
A-files under ‘aEQDSK’ and M-files under ‘mEQDSK’
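A minimal call sketch for this reader, assuming it is exposed as read_basic_eq_from_mds (the function name is an assumption; adapt it to your installation):
>> results = read_basic_eq_from_mds(
>>     device='DIII-D', shot=133221, tree='EFIT01',
>>     g_file_quantities=['r', 'z', 'rhovn'],
>>     a_file_quantities=['area'],
>>     derived_quantities=['time', 'psin', 'psin1d'],
>> )
>> # results['time'] would then hold the EFIT time base in ms (layout assumed)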
Make a quick and dirty estimate for x-point position to guide higher quality estimation
The goal is to identify the primary x-point to within a grid cell or so
Parameters:
rgrid – 1d float array
R coordinates of the grid
zgrid – 1d float array
Z coordinates of the grid
psigrid – 2d float array
psi values corresponding to rgrid and zgrid
psi_boundary – float [optional]
psi value on the boundary; helps distinguish the primary x-point from other field nulls
If this is not provided, you may get the wrong x-point.
psi_boundary_weight – float
Sets the relative weight of matching psi_boundary compared to minimizing B_pol.
1 gives ~equal weight after normalizing Delta psi by grid spacing and r (to make it comparable to B_pol in
the first place)
10 gives higher weight to psi_boundary, which might be nice if you keep locking onto the secondary x-point.
Actually, it seems like the outcome isn’t very sensitive to this weight. psi_boundary is an adequate tie
breaker between two B_pol nulls with weights as low as 1e-3 for some cases, and it’s not strong enough to move
the quick estimate to a different grid cell on a 65x65 grid with weights as high as 1e2. Even then, the result is
still close enough to the true X-point that the higher quality algorithm can find the same answer. So, just
leave this at 1.
zsign – int
If you know the X-point you want is on the top or the bottom, you can pass in 1 or -1 to exclude
the wrong half of the grid.
Returns:
two element float array
Low quality estimate for the X-point R,Z coordinates with units matching rgrid
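A usage sketch, assuming the routine is available as x_point_quick_search (name assumed) and that a gEQDSK has already been loaded; the uppercase keys below are standard gEQDSK fields:
>> import numpy as np
>> g = OMFIT['test']  # an OMFITgeqdsk instance loaded elsewhere
>> rgrid = np.linspace(g['RLEFT'], g['RLEFT'] + g['RDIM'], g['NW'])
>> zgrid = np.linspace(g['ZMID'] - g['ZDIM'] / 2.0, g['ZMID'] + g['ZDIM'] / 2.0, g['NH'])
>> rx, zx = x_point_quick_search(rgrid, zgrid, g['PSIRZ'], psi_boundary=g['SIBRY'], zsign=-1)  # zsign=-1: lower X-point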
Returns the native COCOS that an unmodified gEQDSK would obey, defined by sign(Bt) and sign(Ip)
In order for psi to increase from axis to edge and for q to be positive:
All use sigma_RpZ=+1 (phi is counterclockwise) and exp_Bp=0 (psi is flux/(2*pi))
We want
sign(psi_edge-psi_axis) = sign(Ip)*sigma_Bp > 0 (psi always increases in gEQDSK)
sign(q) = sign(Ip)*sign(Bt)*sigma_rhotp > 0 (q always positive in gEQDSK)
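The two sign conditions above determine the native COCOS uniquely once sign(Bt) and sign(Ip) are known. A minimal sketch of that logic (the helper name is made up for illustration; the mapping follows from the conditions, since sigma_RpZ=+1 and exp_Bp=0 restrict the answer to COCOS 1, 3, 5, 7):
>> def native_gEQDSK_cocos(sign_Bt, sign_Ip):
>>     # sigma_Bp = sign(Ip) so that psi increases from axis to edge
>>     # sigma_rhotp = sign(Ip) * sign(Bt) so that q > 0
>>     return {(+1, +1): 1, (+1, -1): 5, (-1, +1): 3, (-1, -1): 7}[(sign_Ip, sign_Bt)]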
Automatically determine the type of an EFIT file and parse it with the appropriate class.
It is faster to just directly use the appropriate class. Using the right class also avoids problems because some
files technically can be parsed with more than one class (no exceptions thrown), giving junk results.
Parameters:
filename – string
Name of the file on disk, including path
EFITtype – string
Letter giving the type of EFIT file, like ‘g’. Should be in ‘gamks’.
If None, then the first letter in the filename is used to determine the file type
If this is also not helping, then a brute-force load is attempted
strict – bool
Filename (not including path) must include the letter giving the file type.
Prevents errors like using sEQDSK to parse g133221.01000, which might otherwise be possible.
**kw – Other keywords to pass to the class that is chosen.
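A short usage sketch, assuming the dispatcher described here is callable as OMFITeqdsk (name assumed; paths are illustrative):
>> eq = OMFITeqdsk('/path/to/g133221.01000')                  # leading 'g' -> parsed as a gEQDSK
>> eq2 = OMFITeqdsk('/path/to/ambiguous_name', EFITtype='a')  # force aEQDSK parsing when the name does not help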
device – The tokamak that the data correspond to (‘DIII-D’, ‘NSTX’, etc.)
shot – Shot number from which to read data
time – time slice from which to read data
exact – get data from the exact time-slice
SNAPfile – A string containing the name of the MDSplus tree to connect to, like ‘EFIT01’, ‘EFIT02’, ‘EFIT03’, …
time_diff_warning_threshold – raise error/warning if closest time slice is beyond this threshold
fail_if_out_of_range – Raise error or warn if closest time slice is beyond time_diff_warning_threshold
show_missing_data_warnings – Print warnings for missing data
1 or True: display with printw
2 or ‘once’: only print the first time
0 or False: display all but with printd instead of printw
None: select based on device. Most will choose ‘once’.
Returns the native COCOS that an unmodified gEQDSK would obey, defined by sign(Bt) and sign(Ip)
In order for psi to increase from axis to edge and for q to be positive:
All use sigma_RpZ=+1 (phi is counterclockwise) and exp_Bp=0 (psi is flux/(2*pi))
We want
sign(psi_edge-psi_axis) = sign(Ip)*sigma_Bp > 0 (psi always increases in gEQDSK)
sign(q) = sign(Ip)*sign(Bt)*sigma_rhotp > 0 (q always positive in gEQDSK)
Method used to linearly combine current equilibrium (eq1) with other g-file
All quantities are linearly combined, except ‘RBBBS’,’ZBBBS’,’NBBBS’,’LIMITR’,’RLIM’,’ZLIM’,’NW’,’NH’
OMFIT[‘eq3’]=OMFIT[‘eq1’].combineGEQDSK(OMFIT[‘eq2’],alpha)
means:
eq3=alpha*eq1+(1-alpha)*eq2
Reconstructs Fourier decomposition of the boundary for fixed boundary codes to use
Parameters:
surface – normalised flux surface to use for the boundary (if <0 then the original gEQDSK BBBS boundary is used; otherwise the surface is traced with FluxSurfaces).
nf – number of Fourier modes
symmetric – return symmetric boundary
resolution – FluxSurfaces resolution factor
**kw – additional keyword arguments are passed to FluxSurfaces.findSurfaces
Function used to plot g-files. This plot shows flux surfaces in the vessel, pressure, q profiles, P’ and FF’
Parameters:
usePsi – In the plots, use psi instead of rho, or both
only1D – only make profile plots
only2D – only make flux surface plot
top2D – Plot top-view 2D cross section
q_contour_n – If above 0, plot q contours in 2D plot corresponding to rational surfaces of the given n
label_contours – Adds labels to 2D contours
levels – list of sorted numeric values to pass to 2D plot as contour levels
mask_vessel – mask contours with vessel
show_limiter – Plot the limiter outline in (R,Z) 2D plots
xlabel_in_legend – Show x coordinate in legend instead of under axes (useful for overplots with psi and rho)
label – plot item label to apply lines in 1D plots (only the q plot has legend called by the geqdsk class
itself) and to the boundary contour in the 2D plot (this plot doesn’t call legend by itself)
ax – Axes instance to plot in when using only2D
**kw – Standard plot keywords (e.g. color, linewidth) will be passed to Axes.plot() calls.
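For example, reusing the sample file referenced elsewhere in this documentation (keyword values are illustrative):
>> OMFIT['test'] = OMFITgeqdsk(OMFITsrc + '/../samples/g133221.01000')
>> OMFIT['test'].plot(only2D=True, q_contour_n=1, label_contours=True)  # 2D flux surfaces with n=1 rational-q contours
>> OMFIT['test'].plot(usePsi=True, xlabel_in_legend=True)               # 1D profiles plotted against psi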
max_lim – If max_lim is specified and the number of limiter points
- before downsampling is smaller than max_lim, then no downsampling is performed
- after downsampling is larger than max_lim, then an error is raised
max_bnd – If max_bnd is specified and the number of boundary points
- before downsampling is smaller than max_bnd, then no downsampling is performed
- after downsampling is larger than max_bnd, then an error is raised
device – The tokamak that the data correspond to (‘DIII-D’, ‘NSTX’, etc.)
shot – Shot number from which to read data
time – time slice from which to read data
exact – get data from the exact time-slice
SNAPfile – A string containing the name of the MDSplus tree to connect to, like ‘EFIT01’, ‘EFIT02’, ‘EFIT03’, …
time_diff_warning_threshold – raise error/warning if closest time slice is beyond this threshold
fail_if_out_of_range – Raise error or warn if closest time slice is beyond time_diff_warning_threshold
show_missing_data_warnings – Print warnings for missing data
1 or True: yes, print the warnings
2 or ‘once’: print only unique warnings; no repeats for the same quantities missing from many time slices
0 or False: printd instead of printw
None: select based on device. Most will choose ‘once’.
Fill in gEQDSK data from aug_sfutils, which processes magnetic equilibrium
results from the AUG CLISTE code.
Note that this function requires aug_sfutils to be locally installed
(pip install aug_sfutils will do). Users also need to have access to the
AUG shotfile system.
Parameters:
shot – AUG shot number from which to read data
time – time slice from which to read data
eq_shotfile – equilibrium reconstruction to fetch (EQI, EQH, IDE, …)
ed – edition of the equilibrium reconstruction shotfile
Searches through all the groups in the k-file namelist (IN1,
INS,EFITIN, etc.) and deletes duplicated variables. You can keep
the first instance or the last instance.
Parameters:
keep_first_or_last – string (‘first’ or ‘last’)
- If there are duplicates, only one can be kept. Should it be the first one or the last one?
update_original – bool
Set False to leave the original untouched during testing. Use with make_new_copy.
make_new_copy – bool
Create a copy of the OMFITkeqdsk instance and return it. Useful if the original is not being modified.
Returns:
None or OMFITkeqdsk instance (depending on make_new_copy)
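A hedged usage sketch (the method name remove_duplicates is an assumption; check the OMFITkeqdsk class for the exact name):
>> k = OMFIT['kfile']  # an OMFITkeqdsk instance
>> k_clean = k.remove_duplicates(keep_first_or_last='first', update_original=False, make_new_copy=True)
>> # with update_original=False the original k-file is untouched and the cleaned copy is returned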
Currently this function only handles code_parameters. In the future, parameters including ITIME,PLASMA,EXPMP2,COILS,BTOR,DENR,DENV,
SIREF,BRSP,ECURRT,VLOOP,DFLUX,SIGDLC,CURRC79,CURRC139,CURRC199,CURRIU30,CURRIU90,CURRIU150,
CURRIL30,CURRIL90,CURRIL150 should be specified from ods raw parameters.
Parameters:
ods – input ods from which data is added
time_index – time index from which data is added to ods
time – time in seconds at which to compare the data (if set it supersedes time_index)
Currently this function just reads code_parameters.
In the future, parameters including ITIME,PLASMA,EXPMP2,COILS,BTOR,DENR,DENV,
SIREF,BRSP,ECURRT,VLOOP,DFLUX,SIGDLC,CURRC79,CURRC139,CURRC199,CURRIU30,CURRIU90,CURRIU150,
CURRIL30,CURRIL90, CURRIL150 should be written to ods raw parameters.
Parameters:
ods – input ods to which data is added
time_index – time index to which data is added to ods
time – time in seconds at which to compare the data (if set it supersedes time_index)
Plots pressure constraint in kEQDSK.
For general information on K-FILES, see
- https://fusion.gat.com/theory/Efitinputs
- https://fusion.gat.com/theory/Efitin1
Specific quantities related to extracting the pressure profile
——
KPRFIT: kinetic fitting mode: 0 off, 1 vs psi, 2 vs R-Z, 3 includes rotation
NPRESS: number of valid points in PRESSR; positive number: number of input data, negative number:
read in data from EDAT file, 0: rotational pressure only
RPRESS: -: input pressure profile as a function of dimensionless fluxes (psi_N), +:
R coordinates of input pressure profile in m
ZPRESS: gives Z coordinates to go with R coordinates in RPRESS if RPRESS>0
PRESSR: pressure in N/m^2 (or Pa) vs. normalized flux (psi_N) for fitting
PRESSBI: pressure at boundary
SIGPREBI: standard deviation for pressure at boundary PRESSBI
KPRESSB: 0: don’t put a pressure point at boundary (Default), 1: put a pressure point at the boundary
Specific quantities related to understanding KNOTS & basis functions
——
KPPFNC basis function for P’: 0 = polynomial, 6 = spline
KPPCUR number of coefficients for poly representation of P’, ignored if spline. Default = 3
KPPKNT number of knots for P’ spline, ignored unless KPPFNC=6
PPKNT P’ knot locations in psi_N, vector length KPPKNT, ignored unless KPPFNC=6
PPTENS spline tension for P’. Large (like 10) —> approaches piecewise linear. small (like 0.1)
—> like a cubic spline
KPPBDRY constraint switch for P’. Vector of length KPPKNT. Ignored unless KPPFNC=6
PPBDRY values of P’ at each knot location where KPPBDRY=1
KPP2BDRY on/off for PP2BDRY
PP2BDRY values of (P’)’ at each knot location where KPP2BDRY=1
Parameters:
in1 – NamelistName instance
fig – Figure instance (unused, but accepted to maintain consistent format)
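As a concrete illustration of the switches above, a k-file IN1 namelist fragment requesting a kinetic pressure fit with a spline P' could look like this (all values are illustrative, and it is assumed the IN1 group is exposed as a dict-like entry of the OMFITkeqdsk object):
>> k = OMFIT['kfile']  # an OMFITkeqdsk instance
>> k['IN1']['KPRFIT'] = 1                     # kinetic fitting vs psi_N
>> k['IN1']['NPRESS'] = 3                     # three input pressure points
>> k['IN1']['RPRESS'] = [-0.1, -0.5, -1.0]    # negative values -> positions given in psi_N
>> k['IN1']['PRESSR'] = [5e4, 2e4, 1e3]       # pressure in Pa at those positions
>> k['IN1']['KPPFNC'] = 6                     # spline basis for P'
>> k['IN1']['KPPKNT'] = 3
>> k['IN1']['PPKNT'] = [0.0, 0.5, 1.0]        # knot locations in psi_N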
Documentation on fast ion information in K-FILES:
https://fusion.gat.com/theory/Efitin1 and https://fusion.gat.com/theory/Efitinputs
—
KPRFIT: kinetic fitting mode: 0 off, 1 vs psi, 2 vs R-Z, 3 includes rotation
NBEAM: number of points for beam data in kinetic mode (in vector DNBEAM)
DNBEAM: beam particle density for kinetic EFIT
PBEAM: beam pressure in Pa vs psi_N for kinetic fitting
PNBEAM: defaults to 0. That is all we know
SIBEAM: psi_N values corresponding to PBEAM
KZEROJ: >0: number of constraints to apply
0: don’t apply constraints (default)
SIZEROJ: vector of locations at which Jt is constrained when KZEROJ>0.
When KZEROJ=1, PSIWANT can be used instead of SIZEROJ(1) by setting SIZEROJ(1)<0
see KZEROJ, RZEROJ, VZEROJ, PSIWANT
default SIZEROJ(1)=-1.0
RZEROJ: vector of radii at which to apply constraints.
For each element in vector & corresponding elements in SIZEROJ, VZEROJ, if
RZEROJ>0: set Jt=0 @ coordinate RZEROJ,SIZEROJ
RZEROJ=0: set flux surface average current equal to VZEROJ @ surface specified by normalized flux SIZEROJ
RZEROJ<0: set Jt=0 @ separatrix
applied only if KZEROJ>0. Default RZEROJ(1)=0.0
If KZEROJ=1, may specify SIZEROJ(1) w/ PSIWANT. If KZEROJ=1 and SIZEROJ(1)<0 then SIZEROJ(1) is set equal to PSIWANT
PSIWANT: normalized flux value of surface where J constraint is desired.
See KZEROJ, RZEROJ, VZEROJ.
Default=1.0
VZEROJ: Desired value(s) of normalized J (w.r.t. I/area) at
the flux surface PSIWANT (or surfaces SIZEROJ).
Must have KZEROJ = 1 or >1 and RZEROJ=0.0.
Default=0.0
summary: you should set KZEROJ to the number of constraint points, then use the SIZEROJ and VZEROJ vectors to set up the psi_N and Jt values at the constraint points
KNOTS & basis functions
KFFFNC basis function for FF’: 0 = polynomial, 6 = spline
ICPROF specific choice of current profile: 0 = current profile is not specified by this variable,
1 = no edge current density allowed
2 = free edge current density
3 = weak edge current density constraint
KFFCUR number of coefficients for poly representation of FF’, ignored if spline. Default = 1
KFFKNT number of knots for FF’. Ignored unless KFFFNC=6
FFKNT knot locations for FF’ in psi_N, vector length should be KFFKNT. Ignored unless kfffnc=6
FFTENS spline tension for FF’. Large (like 10) —> approaches piecewise linear. small (like 0.1) —> like a cubic spline
KFFBDRY constraint switch for FF’ (0/1 off/on) for each knot. default to zeros
FFBDRY value of FF’ for each knot, used only when KFFBDRY=1
KFF2BDRY: on/off (1/0) switch for each knot
FF2BDRY value of (FF’)’ for each knot, used only when KFF2BDRY=1
K-FILES
plot MSE constraints
see https://fusion.gat.com/theory/Efitinputs
RRRGAM R in meters of the MSE observation point
ZZZGAM Z in meters of the MSE observation point
TGAMMA “tangent gamma”. Tangent of the measured MSE polarization angle, TGAMMA=(A1*Bz+A5*Er)/(A2*Bt+…)
SGAMMA standard deviation (uncertainty) for TGAMMA
FWTGAM “fit weight gamma”: 1/0 on/off switches for MSE channels
DTMSEFULL full width of MSE data time average window in ms
AA#GAM where # is 1,2,3,…: geometric correction coefficients for MSE data, generated by EFIT during mode 5
K-files plot mass density profile
see https://fusion.gat.com/theory/Efitin1
NMASS: number of valid points in DMASS
DMASS: density mass. Mass density in kg/m^3
I am ASSUMING that this uses RPRESS (psi or R_major for pressure constraint) to get the position coordinates
Plot manager for k-file class OMFITkeqdsk
Function used to decide what real plot function to call and to apply generic plot labels.
You can also access the individual plots directly, but you won’t get the final annotations.
EFIT k-file inputs are documented at https://fusion.gat.com/theory/Efitinputs
:param plottype: string
What kind of plot?
‘everything’
‘pressure and current’
‘pressure’
‘current’
‘fast ion density’
‘fast ion pressure’
‘mse’
‘mass density’
Parameters:
fig – [Optional] Figure instance
Define fig and ax to override automatic determination of plot output destination.
ax – [Optional] Axes instance or array of Axes instances
Define fig and ax to override automatic determination of plot output destination.
Ignored if there are not enough subplots to contain the plots ordered by plottype.
label – [Optional] string
Provide a custom label to include in legends. May be useful when overlaying two k-files.
Default: ‘’. Set label=None to disable shot and time in legend entries.
no_extra_info_in_legend – bool
Do not add extra text labels to the legend to display things like Bt, etc.
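For example (plottype strings as listed above):
>> k = OMFIT['kfile']  # an OMFITkeqdsk instance
>> k.plot(plottype='pressure')                                   # just the pressure constraint
>> k.plot(plottype='everything', no_extra_info_in_legend=True)   # full overview with shorter legend entries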
Tests whether residuals conform to normal distribution and saves a P-value.
A good fit should have random residuals following a normal distribution (due to random measurement errors
following a normal distribution). The P-value assesses the probability that the distribution of residuals could
be at least as far from a normal distribution as are the measurements. A low P-value is indicative of a bad
model or some other problem.
https://www.graphpad.com/guides/prism/5/user-guide/prism5help.html?reg_diagnostics_tab_5_3.htm and https://www.graphpad.com/support/faqid/1577/
Parameters:
which – string
Parameter to do test on. Options: [‘mag’, ‘ecc’, ‘fcc’, ‘lop’, ‘gam’, ‘pre’, ‘xxj’]
is_tmp – string
How many of the stats quantities should be loaded as OMFITncDataTmp (not saved to disk)?
‘some’, ‘none’, or ‘all’
Calculates R^2 value for fit to a category of signals (like magnetics).
The result will be saved into the mEQDSK instance as stats_rsq***. R^2 measures the fraction of variance in the
data which is explained by the model and can range from 1 (model explains data) through 0 (model does no better
than a flat line through the average) to -1 (model goes exactly the wrong way).
https://www.graphpad.com/guides/prism/5/user-guide/prism5help.html?reg_diagnostics_tab_5_3.htm
Combined R^2 from various groups of signals, including ‘all’.
Needs stats_sst* and stats_ssr* and will call rsq_test to make them if not already available.
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as it is this method can detect if .filename was changed and if so, makes a copy from the original .filename (saved in the .link attribute) to the new .filename
Trace flux surfaces and calculate flux-surface averaged and geometric quantities
Inputs can be tables of PSI and Bt or an OMFITgeqdsk file
Parameters:
Rin – (ignored if gEQDSK!=None) array of the R grid mesh
Zin – (ignored if gEQDSK!=None) array of the Z grid mesh
PSIin – (ignored if gEQDSK!=None) PSI defined on the R/Z grid
Btin – (ignored if gEQDSK!=None) Bt defined on the R/Z grid
Rcenter – (ignored if gEQDSK!=None) Radial location where the vacuum field is defined ( B0 = F[-1] / Rcenter)
F – (ignored if gEQDSK!=None) F-poloidal
P – (ignored if gEQDSK!=None) pressure
rlim – (ignored if gEQDSK!=None) array of limiter r points (used for SOL)
zlim – (ignored if gEQDSK!=None) array of limiter z points (used for SOL)
gEQDSK – OMFITgeqdsk file or ODS
resolution – if int the original equilibrium grid will be multiplied by (resolution+1), if float the original equilibrium grid is interpolated to that resolution (in meters)
forceFindSeparatrix – force finding of separatrix even though this may be already available in the gEQDSK file
levels – levels in normalized psi. Can be an array ranging from 0 to 1, or the number of flux surfaces
map – array ranging from 0 to 1 which will be used to set the levels, or ‘rho’ if flux surfaces are generated based on gEQDSK
maxPSI – (default 0.9999)
calculateAvgGeo – Boolean which sets whether flux-surface averaged and geometric quantities are automatically calculated
quiet – Verbosity level
**kw – overwrite key entries
>> OMFIT[‘test’]=OMFITgeqdsk(OMFITsrc+’/../samples/g133221.01000’)
>> # copy the original flux surfaces
>> flx=copy.deepcopy(OMFIT[‘test’][‘fluxSurfaces’])
>> # to use PSI
>> mapping=None
>> # to use RHO instead of PSI
>> mapping=OMFIT[‘test’][‘RHOVN’]
>> # trace flux surfaces
>> flx.findSurfaces(np.linspace(0,1,100), map=mapping)
>> # to increase the accuracy of the flux surface tracing (higher numbers –> smoother surfaces, more time, more memory)
>> flx.changeResolution(2)
>> # plot
>> flx.plot()
packing – if levels is integer, packing of flux surfaces close to the separatrix
resolution – accuracy of the flux surface tracing
rlim – list of R coordinates points where flux surfaces intersect limiter
zlim – list of Z coordinates points where flux surfaces intersect limiter
open_flx – dictionary with flux surface rhon values as keys, specifying where to calculate the SOL (passing this will not set the sol entry in the flux-surfaces class)
Function used to generate boundary shapes based on T. C. Luce, PPCF, 55 9 (2013)
Direct Python translation of the IDL program /u/luce/idl/shapemaker3.pro
Parameters:
a – minor radius
eps – aspect ratio
kapu – upper elongation
kapl – lower elongation
delu – upper triangularity
dell – lower triangularity
zetaou – upper outer squareness
zetaiu – upper inner squareness
zetail – lower inner squareness
zetaol – lower outer squareness
zoffset – z-offset
upnull – toggle upper x-point
lonull – toggle lower x-point
npts – int
number of points (per quadrant)
doPlot – plot boundary shape construction
newsq – A 4 element array, into which the new squareness values are stored
gEQDSK – input gEQDSK to match (wins over rbbbs and zbbbs)
verbose – print debug statements
npts – int
Number of points
Returns:
dictionary with parameters to feed to the boundaryShape function
[a, eps, kapu, kapl, delu, dell, zetaou, zetaiu, zetail, zetaol, zoffset, upnull, lonull]
gEQDSK – input gEQDSK to match (wins over rbbbs and zbbbs)
verbose – print debug statements
doPlot – visualize match
precision – optimization tolerance
npts – int
Number of points
Returns:
dictionary with parameters to feed to the boundaryShape function
[a, eps, kapu, kapl, delu, dell, zetaou, zetaiu, zetail, zetaol, zoffset, upnull, lonull]
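A hedged sketch tying the fit back to boundaryShape (the name of the fitting helper, fitBoundaryShape, is an assumption; boundaryShape is named above):
>> params = fitBoundaryShape(gEQDSK=OMFIT['test'], doPlot=False)  # fitting helper name assumed
>> boundaryShape(doPlot=True, **params)                           # regenerate the boundary from the fitted parameters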
This class is a subclass of OMFITobject and is used in OMFIT
when loading of an OMFITobject subclass object goes wrong
during the loading of a project.
Note that the original file from which the loading failed is not lost
but can be accessed from the .filename attribute of this object.
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as it is this method can detect if .filename was changed and if so, makes a copy from the original .filename (saved in the .link attribute) to the new .filename
Format Python string according to OMFIT style
Based on BLACK: https://github.com/psf/black with 140 chars
Equivalent to running: black -S -l 140 -t py36 filename
NOTE: some versions of black have issues when a comma trails a parenthesis;
version 19.3b0 is OK
Parameters:
content – string with Python code to format
Returns:
formatted Python code
None if nothing changed
False if formatting was skipped due to an InvalidInput
Format Python file according to OMFIT style
Based on BLACK: https://github.com/psf/black with 140 chars
Equivalent to running: black -S -l 140 -t py36 filename
Parameters:
filename – filename of the Python file to format
If a directory is passed, then all files ending with .py will be processed
overwrite – overwrite original file or simply return if the file has changed
Returns:
formatted Python code
None if nothing changed
False if style enforcement is skipped or the input was invalid
If a directory, then a dictionary with each processed file as key is returned
Class used to interface with GYRO results directory
This class extends the OMFITgyro class with the save/load methods of the OMFITpath class
so that the save/load carries all the original files from the GYRO output
Parameters:
filename – directory where the GYRO result files are stored.
The data in this directory will be loaded upon creation of the object.
extra_files – Any extra files that should be downloaded from the remote location
test_mode – Don’t raise an exception if out.gyro.t is not present (and abort loading at that point)
key – function that returns a string that is used for sorting or dictionary key whose content is used for sorting
>> tmp=SortedDict()
>> for k in range(5):
>> tmp[k]={}
>> tmp[k][‘a’]=4-k
>> # by dictionary key
>> tmp.sort(key=’a’)
>> # or equivalently
>> tmp.sort(key=lambda x:tmp[x][‘a’])
Parameters:
**kw – additional keywords passed to the underlying list sort command
key – function that returns a string that is used for sorting or dictionary key whose content is used for sorting
>> tmp=SortedDict()
>> for k in range(5):
>> tmp[k]={}
>> tmp[k][‘a’]=4-k
>> # by dictionary key
>> tmp.sort(key=’a’)
>> # or equivalently
>> tmp.sort(key=lambda x:tmp[x][‘a’])
Parameters:
**kw – additional keywords passed to the underlying list sort command
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as it is this method can detect if .filename was changed and if so, makes a copy from the original .filename (saved in the .link attribute) to the new .filename
OMFIT class used to interface to equilibria files generated by GATO (.dskgato files)
NOTE: this object is “READ ONLY”, meaning that changes to the entries of this object will not be saved to a file. Method .save() could be written if it becomes necessary
Parameters:
filename – filename passed to OMFITobject class
**kw – keyword dictionary passed to OMFITobject class
Function used to plot dskgato-files.
This plot shows flux surfaces in the vessel, pressure, q profiles, P’ and FF’
Parameters:
usePsi – In the plots, use psi instead of rho
only2D – only make flux surface plot
levels – list of sorted numeric values to pass to 2D plot as contour levels
label – plot item label to apply lines in 1D plots (only the q plot has legend called by the geqdsk class
itself) and to the boundary contour in the 2D plot (this plot doesn’t call legend by itself)
ax – Axes instance to plot in when using only2D
**kw – Standard plot keywords (e.g. color, linewidth) will be passed to Axes.plot() calls.
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as it is this method can detect if .filename was changed and if so, makes a copy from the original .filename (saved in the .link attribute) to the new .filename
token – string or None
Token for accessing GitHub
None triggers attempt to decrypt from $GHUSER@token.github.com credential file
Must be set up in advance with set_OMFIT_GitHub_token() function
org – string [optional] The organization that the repo is under, like ‘gafusion’.
If None, attempts to lookup with method based on git rev-parse.
Falls back to gafusion on failure.
repository – string [optional]
The name of the repo on GitHub.
If None, attempts to lookup with method based on git rev-parse.
Falls back to OMFIT-source on failure.
path – string
The part of the repo api to access
token – string or None
Token for accessing GitHub
None triggers attempt to decrypt from file (must be set up in advance).
Passed to get_OMFIT_GitHub_token.
selection – dict
A dictionary such as {‘state’:’all’}
Gets the pull request number associated for the current git branch if there is an open pull request.
Passes parameters org, destination_org, branch, and repository to get_git_branch_info().
Parameters:
return_info – bool [optional]
Return a dictionary of information instead of just the pull request number
Returns:
int, dict-like, or None
Pull request number if one can be found, otherwise None.
If return_info: dictionary returned by OMFITgithub_paged_fetcher with ‘org’ key added. Contains ‘number’, too.
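Example usage of the lookup described above:
>> pr = get_pull_request_number()
>> if pr is None:
>>     print('No open pull request found for the current branch')
>> else:
>>     info = get_pull_request_number(return_info=True)
>>     print('Pull request #%d on org %s' % (pr, info['org']))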
Looks up local name for upstream remote repo, GitHub org, repository name, current branch, & open pull request info
All parameters are optional and should only be provided if trying to override some results
Parameters:
remote – string [optional]
Local name for the upstream remote.
If None, attempts to lookup with method based on git rev-parse.
org – string [optional]
The organization that the repo is under, like ‘gafusion’.
If None, attempts to lookup with method based on git rev-parse.
Falls back to gafusion on failure.
destination_org – string [optional]
Used for cross-fork pull requests: specify the destination org of the pull request.
The pull request actually exists on this org, but it is not where the source branch lives.
If None it defaults to same as org
repository – string [optional]
The name of the repo on GitHub.
If None, attempts to lookup with method based on git rev-parse.
Falls back to OMFIT-source on failure.
branch – string [optional]
Local/remote name for the current branch
NOTE: there is an assumption that the local and upstream branches have same name
url – string [optional]
Provided mainly for testing.
Overrides the url that would be returned by git config --get remote.origin.url.
omfit_fallback – bool
Default org and repository are gafusion and OMFIT-source instead of None and None in case of failed lookup.
no_pr_lookup – bool
Improve speed by skipping lookup of pull request number
return_pr_destination_org – bool
If an open pull request is found, changes remote, org, repository,
and branch to match the destination side of the pull request.
If there is no pull request or this flag is False,
remote/org/repo/branch will correspond to the source.
server – string [optional]
The server of the remote - usually github.com, but could also be something like vali.gat.com.
Returns:
tuple containing 4 strings and a dict, with elements to be replaced with None for lookup failure
remote (str), org (str), repository (str), branch (str), pull_request_info (dict)
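A usage sketch unpacking the return value described above:
>> remote, org, repository, branch, pr_info = get_git_branch_info()
>> # elements may be None (or the gafusion/OMFIT-source fallbacks) if a lookup fails
>> print(remote, org, repository, branch)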
thread – int [optional]
The number of the issue or pull request within the fork of interest
If not supplied, the current branch name will be used to search for open pull requests on GitHub.
comment – string
The comment to be posted
org – string [optional]
Leave this as gafusion to post to the main OMFIT repo.
Enter something else to post on a fork.
fork – string [optional]
Redundant with org. Use org instead. Fork is provided for backwards compatibility
repository – string [optional]
The name of the repo on GitHub.
If None, attempts to lookup with method based on git rev-parse.
Falls back to OMFIT-source on failure.
You should probably leave this as None unless you’re doing some testing,
in which case you may use the regression_notifications repository under gafusion.
token – string or None
Token for accessing GitHub
None triggers attempt to decrypt from $GHUSER@token.github.com credential file
Must be set up in advance with set_OMFIT_GitHub_token() function
Returns:
response instance
As generated by requests.
It should have a status_code attribute, which is normally int(201) for successful
GitHub posts and probably 4xx for failures.
Looks up comments on a GitHub issue or pull request and searches for ones with body text matching contains
Parameters:
thread – int or None
int: issue or pull request number
None: look up pull request number based on active branch name. Only works if a pull request is open.
contains – string or list of strings
Check for these strings within comment body text. They all must be present.
user – bool or string [optional]
True: only consider comments made with the current username (looks up GITHUB_username from MainSettings)
string: only comments made by the specified username.
id_only – bool
Return only the comment ID numbers instead of full dictionary of comment info
org – string [optional] The organization that the repo is under, like ‘gafusion’.
If None, attempts to lookup with method based on git rev-parse.
Falls back to gafusion on failure.
repository – string [optional]
The name of the repo on GitHub. If None, attempts to lookup with
method based on git rev-parse. Falls back to OMFIT-source on failure.
**kw – keywords to pass to OMFITgithub_paged_fetcher
Returns:
list of dicts (id_only=False) or list of ints (id_only=True)
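For instance, to collect the IDs of your own automatically generated report comments on the current pull request (the marker string is just an example):
>> cids = find_gh_comments(thread=None, contains='Automated regression report', user=True, id_only=True)
>> print('%d matching comment(s) found' % len(cids))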
Deletes GitHub comments that contain a keyword. Use CAREFULLY for clearing obsolete automatic test report posts.
Parameters:
thread – int [optional]
Supply issue or comment number or leave as None to look up an open pull request # for the active branch
keyword – string or list of strings
CAREFUL! All comments which match this string will be deleted.
If a list is provided, every substring in the list must be present in a comment.
test – bool
Report which comments would be deleted without actually deleting them.
token – string or None
Token for accessing GitHub
None triggers attempt to decrypt from $GHUSER@token.github.com credential file
Must be set up in advance with set_OMFIT_GitHub_token() function
org – string [optional] The organization that the repo is under, like ‘gafusion’.
If None, attempts to lookup with method based on git rev-parse.
Falls back to gafusion on failure.
repository – string [optional]
The name of the repo on GitHub.
If None, attempts to lookup with method based on git rev-parse.
Falls back to OMFIT-source on failure.
quiet – bool
Suppress print output
exclude – list of strings [optional]
List of CIDs to exclude / protect from deletion. In addition to actual CIDs, the special value of ‘latest’ is
accepted and will trigger lookup of the matching comment with the most recent timestamp. Its CID will replace
‘latest’ in the list.
exclude_contain – list of strings [optional]
If provided, comments must contain all of the strings listed in their body in order to qualify for exclusion.
match_username – bool or string [optional]
True: Only delete comments that match the current username.
string: Only delete comments that match the specified username.
**kw – keywords to pass to find_gh_comments()
Returns:
list of responses from requests (test=False) or list of dicts with comment info (test=True)
response instances should have a status_code attribute, which is normally int(201) for successful
GitHub posts and probably 4xx for failures.
Edits GitHub comments to update automatically generated information, such as regression test reports.
Parameters:
comment_mark – str or None
None: edit top comment.
str: Search for a comment (not including top comment) containing comment_mark as a substring.
new_content – str
New content to put into the comment
Special cases:
If content is None and mode == ‘replace_between’:
Separate open/close separators & close separator present in target: delete between 1st open sep and next close sep
Separate open/close separators & close separator not present in target: delete everything after 1st open sep
All same separator & one instance present in target: delete it and everything after
All same separator & multiple instances present: delete the first two and everything in between.
If content is None and mode != ‘replace_between’, raise ValueError
If None and mode == ‘replace_between’, but separator not in comment, abort but do not raise.
separator – str or list of strings
Substring that separates parts that should be edited from parts that should not be edited.
‘---’ will put a horizontal rule in GitHub comments, which seems like a good idea for this application.
If this is a list, the first and second elements will be used as the opening and closing separators to allow for
different open/close around a section.
mode – str
Replacement behavior.
‘replace’: Replace entire comment. Ignores separator.
‘append’: Append new content to old comment. Places separator between old and new if separator is supplied.
Closes with another separator if separate open/close separators are supplied; otherwise just places one
separator between the original content and the addition.
'replace_between': Replace between first & second appearances of separator (or between open & close separators).
Raises ValueError if separator is not specified.
Acts like mode == ‘append’ if separator (or opening separator) is not already present in target comment.
other: raises ValueError
thread – int [optional]
Issue or pull request number. Looked up automatically if not provided
org – str [optional]
GitHub org. Determined automatically if not supplied.
repository – str [optional]
GitHub repository. Determined automatically if not supplied.
token – str
Token for accessing GitHub. Decrypted automatically if not supplied.
Returns:
response instance or None
None if aborted before attempting to post, otherwise response instance, which is an object generated
by the requests module. It should have a status_code attribute which is 2** on success
and often 4** for failures.
Updates the status of a pull request on GitHub. Appears as green check mark or red X at the end of the thread.
Parameters:
org – string [optional] The organization that the repo is under, like ‘gafusion’.
If None, attempts to lookup with method based on git rev-parse.
Falls back to gafusion on failure.
destination_org – string [optional]
Used for cross-fork pull requests: specify the destination org of the pull request.
The pull request actually exists on this org, but it is not where the source branch lives.
Passed to get_pull_request_number when determining whether a pull request is open.
Defines first org to check.
If None it defaults to same as org
repository – string [optional]
The name of the repo on GitHub.
If None, attempts to lookup with method based on git rev-parse.
Falls back to OMFIT-source on failure.
commit – commit hash or keyword
‘latest’ or ‘HEAD~0’:
Look up latest commit. This is appropriate if you have reloaded modules and classes as needed.
’omfit’ or None:
use commit that was active when OMFIT launched.
Appropriate if testing classes as loaded during launch and not reloaded since.
else: treats input as a commit hash and will fail if it is not one.
state – string or bool
‘success’ or True: success -> green check
‘failure’ or False: problem -> red X
‘pending’ -> yellow circle
context – string
Match the context later to update the status
description – string
A string with a description of the status. Up to 140 characters. Long strings will be truncated.
Line breaks, quotes, parentheses, etc. are allowed.
target_url – string
URL to associate with the status
Returns:
response instance as generated by requests.
It should have a status_code attribute, which is normally int(201) for successful GitHub posts and probably 4xx for failures.
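A hedged sketch, assuming the status updater is exposed as set_gh_status (name assumed):
>> r = set_gh_status(state='pending', context='regression/quick', description='Tests started')
>> # ... run the tests ...
>> r = set_gh_status(state=True, context='regression/quick', description='All tests passed')
>> print(r.status_code)  # expect 201 on success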
Compared to SQL tables, Google sheets allow for formatting and other conveniences
that make them easier to update by hand (appropriate for some columns like labels,
comments, etc.), but the API for Google sheets is relatively simple. This file
is meant to provide tools for looking up data by column header instead of column
number/letter and by shot instead of row number. That is, the user should not have
to know which row holds shot 123456 or which column holds density in order to get
the average density for shot 123456.
In this script, the purely numeric way to refer to cell ‘A1’ is (0, 0). Do not
allow any reference to A1 as (1, 1) to creep into this code or it will ruin
everything. Referring to (‘A’, 1) is fine; there’s a function for interpreting
cell references, but it uses the presence of a letter in the column as a cue that
the row should be counted from 1, so it can’t handle counting columns from 1.
Some packages DO use numbers (from 1) instead of indices (from 0) internally. If
one of these packages is used, its returned values must be converted immediately.
YOU WON’T LIKE WHAT HAPPENS IF YOU FAIL TO MAINTAIN THIS DISCIPLINE.
A sample call that should work to start up an OMFITgoogleSheet instance is:
>> gsheet = OMFITgoogleSheet(
>> keyfile=os.sep.join([OMFITsrc, ‘..’, ‘samples’, ‘omfit-test-gsheet_key.json’]),
>> sheet_name=’OmfitDataSheetTestsheet’,
>> subsheet_name=’Sheet1’, # Default: lookup first subsheet
>> column_header_row_idx=5, # Default: try to guess
>> units_row_idx=6, # Default: None
>> data_start_row_idx=7, # Default: header row + 1 (no units row) or + 2 (if units row specified)
>> )
This call should connect to the example sheet. This is more than an example; this is a functional call
that is read out of the docstring by the regression test and testing will fail if it doesn’t work properly.
Parameters:
filename – str
Not used, but apparently required when subclassing from OMFITtree.
keyfile – str or dict-like
Filename with path of the file with authentication information,
or dict-like object with the parsed file contents (OMFITjson should work well).
See setup_help for help setting this up.
sheet_name – str
Name of the Google sheets file/object/whatever to access
subsheet_name – str
Sheet within the sheet (the tabs at the bottom). Defaults to the first tab.
column_header_row_idx – int
Index (from 0) of the row with column headers. If not specified, we will try to guess for you.
Indices like this are stored internally.
column_header_row_number – int
Number (from 1) of the row with column headers, as it appears in the sheet.
Ignored if column_header_row_idx is specified. If neither is specified, we will try to guess for you.
This will be converted into an index (from 0) for internal use.
units_row_idx – int or None
Index (from 0) of the row with the units, if there is one, or None if there isn’t a row for units.
units_row_number – int or None
Number (from 1) of the units row. See description of column_header_row_number.
data_start_row_idx – int
Index (from 0) of the row where data start.
Defaults to column_header_row + 1 if units_row is None or +2 if units_row is not None.
data_start_row_number – int
Number (from 1) of the first row of data after the header.
See description of column_header_row_number.
kw – additional keywords passed to super class’s __init__
Makes sure setup is acceptable and raises AssertionError otherwise
Parameters:
essential_only – bool
Skip checks that aren’t essential to initializing the class and its connection.
This avoids spamming warnings about stuff that might get resolved later in setup.
value – numeric or string
This will be written to the cell
args – address information, such as (column, row) or address string.
Examples: (‘A’, 1), (0, 0), ‘A1’.
See interpret_row_col() for more information about how to specify row and column.
Updates the cached representation if forced or if it seems like it’s a good idea
If it’s been a while since the cache was updated, do another update.
If we know there was a write operation since the last update, do another update.
If the cache is missing, grab the data and make the cache.
If force=True, do another update.
Parameters:
force – bool
Force an update, even if other indicators would lead to skipping the update
By default, the local cache will be checked & updated if needed and then data will be read from the cache.
Caching can be disabled, which will result in more connections to the remote sheet. Some other methods have
been programmed assuming that local caching done here will save them from otherwise inefficient layout of
calls to this function.
Parameters:
column – int or str
Column index (from 0) or column letter (like ‘A’, ‘B’, … ‘ZZ’)
force_update_cache – bool
Force the cache to update before reading
disable_cache – bool
Don’t go through the local cache; read the remote column directly
Extends OMFITgoogleSheet by assuming each row corresponds to a shot, allowing more features to be provided.
This is less general (now Shot must exist), but it may be more convenient
to look up data by shot this way. With more assumptions about the structure
of the sheet, more methods can be implemented.
Many methods go further to assume that there is a column named ‘Device’ and
that there are two columns that can be used to determine a time range. Parts
of the class should work on sheets without this structure, but others will
fail.
A sample call that should work to start up an OMFITgoogleSheet instance is:
>> xtable = OMFITexperimentTable(
>> keyfile=os.sep.join([OMFITsrc, ‘..’, ‘samples’, ‘omfit-test-gsheet_key.json’]),
>> sheet_name=’OmfitDataSheetTestsheet’,
>> subsheet_name=’Sheet1’, # Default: lookup first subsheet
>> column_header_row_idx=5, # Default: try to guess
>> units_row_idx=6, # Default: search for units row
>> data_start_row_idx=7, # Default: find first number after the header row in the shot column
>> )
This call should connect to the example sheet. These data are used in regression tests and should be updated
if the test sheet is changed.
Makes sure setup is acceptable and raises AssertionError otherwise
Parameters:
essential_only – bool
Skip checks that aren’t essential to initializing the class and its connection.
This avoids spamming warnings about stuff that might get resolved later in setup.
This is a simplification of the parent class’s methods, because here we explicitly require a ‘Shot’ column.
It should be faster and more reliable this way.
Recommended: put a row of units under the column headers (data won’t start on the next row after headers)
Were recommendations followed? See if 1st row under headers has “units” under a
column named Shot, id, or ID. In this case, data start on the row after units.
This function is easier to implement under OMFITexperimentTable instead of
OMFITgoogleSheet because we are allowed to make additional assumptions about the
content of the table.
If there is no cell containing just ‘units’, ‘Units’, or ‘UNITS’, but you do
have a row for units, you should specify units_row at init.
Allowing this to be different from header + 1 means meta data can be inserted under the column headers.
Since this class imposes the requirement that the “Shot” column exists & has shot numbers, we can find the
first valid number in that column, after the header row, and be safe in assuming that the data start there.
To allow for nonstandard configurations of blank cells, please specify data_start_row manually in init.
Downloads data for a given column and returns them with units
Parameters:
column – str or int
str: Name of the column; index will be looked up
int: Index of the column, from 0. Name of the column can be looked up easily.
pad_with – str
Value to fill in to extend the column if it is too short.
Truncation of results to the last non-empty cell is possible, meaning columns
can have inconsistent lengths if some have more empty cells than others.
force_type – type [optional]
If provided, pass col_values through list_to_array_type() to force them to match force_type.
fill_value – object
Passed to list_to_array_type(), if used; used to fill in cells that can’t be forced to force_type.
It probably would make sense if this were the same as pad_with, but it doesn’t have to be.
It does have to be of type force_type, though.
fill_if_not_found – str
Value to fill in if data are not found, such as if column_name is None.
Returns:
(array, str or None)
values in the column, padded or cut to match the number of shots
replaced by fill_if_not_found in case of bad column specifications
replaced by fill_value for individual cells that can’t be forced to type force_type
extended by pad_with as needed
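A hedged sketch using the xtable instance from the earlier example (the method name get_column is an assumption; consult the class for the exact name):
>> import numpy as np
>> values, units = xtable.get_column('density', force_type=float, fill_value=np.nan)  # method name assumed
>> # values has one entry per shot row; units is None if no units row was configured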
Gets MDSplus data for a list of signals, performs operations on specified time ranges, & saves results.
This sample signal can be used as an instructional example or as an input for testing
>> sample_signal_request = dict(
>> column_name=’density’, # The sheet must have this column (happens to be a valid d3d ptname)
>> TDI=dict(EAST=r’dfsdev’), # d3d’s TDI isn’t listed b/c TDI defaults to column_name
>> treename=dict(EAST=’PCS_EAST’), # DIII-D isn’t listed b/c default is None, which goes to PTDATA
>> factor={‘DIII-D’: 1e-13}, # This particular EAST pointname doesn’t require a factor; defaults to 1
>> tfactor=dict(EAST=1e3), # d3d times are already in ms
>> tmin=’t min’, # str means read timing values from column w header exactly matching this string
>> tmax=’t max’,
>> )
This sample should work with the example/test sheet.
If the sheet in question had ONLY DIII-D data, the same signal request could be accomplished via:
>> simple_sample_signal_request = dict(column_name=’density’, factor=1e-13, tmin=’t min’, tmax=’t max’)
This sample won’t work with the test sheet; it will fail on the EAST shots.
We are exploiting the shortcut that we’ve used a valid pointname (valid for d3d at least)
as the column header. If you want fancy column names, you have to specify TDIs.
We are also relying on the defaults working for this case.
The sample code in this docstring is interpreted and used by the regression test, so don’t break it.
Separate different samples with non-example lines (that don’t start with >>)
Parameters:
signals –
list of dict-likes
Each item in this list should be a dict-like that contains information needed to fetch & process data.
SIGNAL SPECIFICATION
- column_name: str (hard requirement)
Name of the column within the sheet
- TDI: None, str, or dict-like (optional if default behavior (reuse column_name) is okay)
None: reuse column_name as the TDI for every row
str: use this str as the pointname/TDI for every row
dict-like: keys should be devices. Each device can have its own pointname/TDI.
The sheet must have a Device column. If a device is missing, it will inherit the column_name.
- treename: None, str, or dict-like (optional if default behavior (d3d ptdata) is okay)
None: PTDATA (DIII-D only)
str: use this string as the treename for every row
dict-like: device specific treenames. The sheet must have a Device column.
- factor: float or dict-like (defaults to 1.0)
float: multiply results by this number before writing
dict-like: multiply results by a device specific number before writing (unspecified devices get 1)
- tfactor: float or dict-like (defaults to 1.0)
float: multiply times by this number to get ms
dict-like: each device gets a different factor used to put times in ms (unspecified devices get 1)
PROCESSING
- operation: str (defaults to ‘mean’)
Operation to perform on the gathered data. Options are:
- ‘mean’
- ‘median’
- ‘max’
- ‘min’
- tmin: float or str
Start of the time range in ms; used for processing operations like average.
Must be paired with tmax.
A usable tmin/tmax pair takes precedence over time+dt.
A float is used directly. A string triggers lookup of a column in the sheet; then every row gets
its own number determined by its entry in the specified column.
- tmax: float or str
End of the time range in ms. Must be paired with tmin.
- time: float or str
Center of a time range in ms. Ignored if tmin and tmax are supplied. Must be paired with dt.
- dt: float or str
Half-width of time window in ms. Must be paired with time.
overwrite – bool
Update the target cell even if it’s not empty?
device_servers – dict-like [optional]
Provide alternative MDS servers for some devices. A common entry might be {‘EAST’: ‘EAST_US’}
to use the ‘EAST_US’ (eastdata.gat.com) server with the ‘EAST’ device.
Wrapper for write_mds_data_to_table for quickly setting up EFIT signals.
Assumes that device-specific variations on EFIT pointnames and primary trees
are easy to guess, and that you have named your columns to exactly match EFIT
pointnames in their “pure” form (no leading backslash).
Basically, it can build the signal dictionaries for you given a list of
pointnames and some timing instructions.
Here is a set of sample keywords that could be passed to this function:
>> sample_kw = dict(
>> pointnames=[‘betan’],
>> tmin=’t min’, # Can supply a constant float instead of a string
>> tmax=’t max’,
>> overwrite=True,
>> device_servers=dict(EAST=’EAST_US’), # Probably only needed for EAST
>> )
These should work in xtable.write_efit_result where xtable is an
OMFITexperimentTable instance connected to a google sheet that contains columns
with headers ‘betan’, ‘t min’, and ‘t max’
:param pointnames: list of strings
Use names like betan, etc. This function will figure out whether you really need betan instead.
Parameters:
kw – more settings, including signal setup customizations.
* MUST INCLUDE tmin & tmax OR time & dt! <-- don't forget to include timing data
* Optional signal customization: operation, factor
* Remaining keywords (other than those listed so far) will be passed to write_mds_data_to_table
* Do not include column_name, TDI, treename, or tfactor, as these are determined by this function.
If you need this level of customization, just use write_mds_data_to_table() directly.
This method can detect if .filename was changed and if so, makes a copy from the original .filename
(saved in the .link attribute) to the new .filename
This method can detect if .filename was changed and if so, makes a copy from the original .filename
(saved in the .link attribute) to the new .filename
Child of OMFITncDataset, which is a hybrid xarray Dataset and OMFIT SortedDict object.
This one updates the GPEC naming conventions when it is loaded, and locks the data
by default so users don’t change the code outputs.
Parameters:
filename – Path to file
lock – Prevent in memory changes to the DataArray entries contained
exportDataset_kw – dictionary passed to exportDataset on save
This method can detect if .filename was changed and if so, makes a copy from the original .filename
(saved in the .link attribute) to the new .filename
host – harvesting server address
If None take value from HARVEST_HOST environmental variable, or use default gadb-harvest.duckdns.org if not set.
port – port the harvesting server is listening on.
If None take value from HARVEST_PORT environmental variable, or use default 0 if not set.
verbose – print harvest message to screen
If None take value from HARVEST_VERBOSE environmental variable, or use default False if not set.
tag – tag entry
If None take value from HARVEST_TAG environmental variable, or use default Null if not set.
protocol – transmission protocol to be used (UDP or TCP)
If None take value from HARVEST_PROTOCOL environmental variable, or use default UDP if not set.
process – function passed by user that is called on each of the payload elements prior to submission
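A minimal sketch, assuming the sender is callable as harvest_send with a payload dictionary followed by a table name (function name and argument order assumed):
>> payload = {'shot': 123456, 'betan': 2.1}
>> harvest_send(payload, 'my_test_table', verbose=True, protocol='UDP')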
OMFIT class that translates HDF5 file to python dictionary
At this point this class is read only. Changes made to
its content will not be reflected in the HDF5 file.
If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]
If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v
In either case, this is followed by: for k in F: D[k] = F[k]
Method to read 1D vectors from HELENA output file.
param f: File to read the data. It is assumed that the file is at
the right position to start reading
param dataident: a list containing 5 elements:
[0] : names of the data to be read. The global 1d dictionary will use these names.
[1] : The column indicating the location of the psinorm vector
[2] : The exponent needed to produce psinorm :
1 = the data in the file is already psinorm
2 = the data is in sqrt(psinorm)
[3] : Column numbers for the data
[4] : A string indicating the end of data
Reads a dskgato file, extracts the global parameters and the
boundary shape from that. Calculates HELENA parameters and
reconstructs the boundary using the fourier representation.
OMFIT class used to interface with IDL .sav files
Note that these objects are “READ ONLY”, meaning that the changes to the entries of this object will not be saved to a file.
This class is based on a modified version of the idlsave class provided by https://github.com/astrofrog/idlsave
The modified version (omfit/omfit_classes/idlsaveOM.py) returns python dictionaries instead of np.recarray objects
Parameters:
filename – filename passed to OMFITobject class
**kw – keyword dictionary passed to OMFITobject class
Create a new string object from the given object. If encoding or
errors is specified, then the object must expose a data buffer
that will be decoded using the given encoding and error handler.
Otherwise, returns the result of object.__str__() (if defined)
or repr(object).
encoding defaults to sys.getdefaultencoding().
errors defaults to ‘strict’.
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as it is this method can detect if .filename was changed and if so, makes a copy from the original .filename (saved in the .link attribute) to the new .filename
Search for available IRI runs by device and shot number. Optionally, also search with run_by and tag values.
If multiple search criteria are given, then the search only returns records
that satisfy all criteria. This function returns a dictionary containing
all matching data that are found.
Parameters:
device – string
Name of device for which the analysis was done. Currently only DIII-D is supported.
shot – int
The shot number to search for.
run_by – string [optional]
The run_by username to search for in IRI records. Production runs are run by the user d3dsf, which is the default.
tag – string [optional].
The tag to search for in IRI records. There are currently three main tags: ‘CAKE01’ (no MSE), ‘CAKE02’
(with MSE) and ‘CAKE_FDP’. ‘CAKE01’ and ‘CAKE02’ have 50 ms time resolution and 129x129 equilibria. ‘CAKE_FDP’
has 20 ms time resolution and 257x257 equilibria.
settings – string [optional]
The settings string to search for in IRI records. For example '1FWDyrS' is the default for knot-optimized MSE
constrained equilibria. See documentation for code in between_shot_autorun() in the CAKE module for full
details.
ignore_ignore – bool [optional]
If this flag is set, then the ‘ignore’ field in the IRI metadata tables
will be ignored. Thus this function will then return records that have
been marked ‘ignore’. Defaults to False.
Loads IRI results as described in a runs dictionary, and outputs to an OMFITtree(),
e.g. OMFIT[‘iri_data’] = load_iri_results(runs_dict).
Parameters:
runs_dict – dictionary
Dictionary of metadata describing uploaded IRI data. It should come from
the avaliable_iri_data() function. But editing and deleting of records
is allowed as long as the hierarchy is preserved.
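A hedged sketch combining the two helpers documented above (the device, shot number and tag are placeholders; function names are as spelled in this documentation):
>> runs_dict = avaliable_iri_data(device='DIII-D', shot=123456, tag='CAKE02')   # placeholder query
>> OMFIT['iri_data'] = load_iri_results(runs_dict)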
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as it is this method can detect if .filename was changed and if so, makes a copy from the original .filename (saved in the .link attribute) to the new .filename
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as it is this method can detect if .filename was changed and if so, makes a copy from the original .filename (saved in the .link attribute) to the new .filename
For testing, please see: samples/sample_tex.tex, samples/sample_bib.bib, samples/placeholder.jpb, and
samples/bib_style_d3dde.bst .
Parameters:
main – string
Filename of the main .tex file that will be compiled.
mainroot – string
Filename without the .tex extension. Leave as None to let this be determined
automatically. Only use this option if you have a confusing filename that breaks the automatic determinator,
like blah.tex.tex (hint: try not to give your files confusing names).
local_build – bool
Deploy pdflatex job locally instead of deploying to remote server specified in SETTINGS
export_path – string
Your LaTeX project can be quickly exported to this path if it is defined. The project can be
automatically exported after build if this is defined. Can be defined later in the settings namelist.
export_after_build – bool
Automatically export after building if export_path is defined. Can be updated later in the settings namelist.
hide_style_files – bool
Put style files in hidden __sty__ sub tree in OMFIT (they still deploy to top level folder on disk)
debug – bool
Some data will be saved to OMFIT[‘scratch’]
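A minimal sketch of instantiating the LaTeX project handler (the filename is a placeholder and the build() call is an assumed method name; the output-opening shortcut described further below attempts a build automatically):
>> ltx = OMFITlatex(main='sample_tex.tex', local_build=True)   # placeholder filename
>> ltx.build()                                                 # assumed method for compiling the .tex sources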
filename – ‘directory/bla/OMFITsave.txt’ or ‘directory/bla.zip’ where the OMFITtree will be saved
(if ‘’ it will be saved in the same folder of the parent OMFITtree)
only – list of strings used to load only some of the branches from the tree (eg. [“[‘MainSettings’]”,”[‘myModule’][‘SCRIPTS’]”]
modifyOriginal – by default OMFIT will save a copy and then overwrite previous save only if successful.
If modifyOriginal=True and filename is not .zip, will write data directly at destination,
which will be faster but comes with the risk of deleting a good save if the new save
fails for some reason
readOnly – will place entry in OMFITsave.txt of the parent so that this OMFITtree can be loaded,
but will not save the actual content of this subtree. readOnly=True is meant to be
used only after this subtree is deployed where its filename says it will be. Using this
feature could result in much faster project saves if the content of this tree is large.
quiet – Verbosity level
developerMode – load OMFITpython objects within the tree as modifyOriginal
serverPicker – take server/tunnel info from MainSettings[‘SERVER’]
remote – access the filename in the remote directory
server – if specified the file will be downsync from the server
tunnel – access the filename via the tunnel
**kw – Extra keywords are passed to the SortedDict class
Shortcut for opening the output file. If the output file has not been generated yet, a build will be attempted.
This lets the user easily open the output from the top level context menu for the OMFITlatex instance, which is
convenient.
Generate MacSurfS ASCII file containing control surface for Equivalent Surface Current workflow
:param rs: radius (r/a) of control surface picked from CHEASE vacuum mesh
:param saveCarMadata: flag to save MacDataS for CarMa coupling
Get unit vectors e_s and e_chi and jacobian from real space R,Z
:param vacFlag: flag to calculate metric elements in the whole domain (plasma+vacuum)
:param IIgrid: specify the radial mesh index to calculate vectors within
Calculate normal and tangential components of BPLASMA on given surface (e.g. wall)
Calculate Fourier decomposition of Bn,t,phi
Method used in CarMa Forward Coupling
:param rsurf: normalized radius of surface
:param kdR: flag for alternative calculation of unit vectors (default=False)
Plot perturbed field components b1,b2,b3
The MARS variable “J*b” is shown, not the physical quantity “b”
:param Mmax: upper poloidal harmonic for labeling
:param fig: specify target figure
Plot Bn on surface calculated by get_BatSurf, normalized and reconstructed in (chi,phi) real space
:param fig: specify target figure
:param rsurf: normalized surface radius for plotting, default is plasma boundary
:param n: toroidal mode number for inverse Fourier transform in toroidal angle
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as it is this method can detect if .filename was changed and if so, makes a copy from the original .filename (saved in the .link attribute) to the new .filename
OMFITmatrix leverages both xarray and pandas as an efficient way of storing matrices to file.
Internally, the data is stored as an xarray.DataArray under self[‘data’].
Parameters:
filename – path to file.
bin – if None (default), the filetype is unknown;
if True, NetCDF;
if False, ASCII.
zip – if None (default), compression is unknown;
if False, it is switched off;
if True, on.
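For example, a NetCDF-backed matrix could be loaded and inspected as follows (the filename is a placeholder):
>> m = OMFITmatrix('matrix.nc', bin=True)   # bin=True selects NetCDF, per the parameters above
>> print(m['data'].shape)                   # the payload is an xarray.DataArray under ['data']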
This class provides access to MDSplus value and allows execution of any TDI commands.
The MDSplus value data, dim_of units, error can be accessed by the methods defined in this class.
This class is capable of ssh-tunneling if necessary to reach the MDSplus server.
Tunneling is set based on OMFIT[‘MainSettings’][‘SERVER’]
Parameters:
server – MDSplus server or Tokamak device (e.g. atlas.gat.com or DIII-D)
treename – MDSplus tree (None or string)
shot – MDSplus shot (integer or string)
TDI – TDI command to be executed
quiet – print if no data is found
caching – if False turns off caching system, else behaviour is set by OMFITmdsCache
timeout – int
Timeout setting passed to MDSplus.Connection.get(), in ms. -1 seems to disable timeout.
Only works for newer MDSplus versions, such as 7.84.8.
To resample data on the server side, which can be much faster when dealing with large data and slow connections,
provide t_start, t_end, and dt in the system’s time units (ms or s). For example, to get stored energy from EFIT01
from 0 to 5000 ms at 50 ms intervals (DIII-D uses time units of ms; some other facilities like KSTAR use seconds):
To access DIII-D PTDATA, set treename=’PTDATA’ and use the TDI keyword to pass signal to be retrieved.
Note: the PTDATA ical option can be passed by setting the shot keyword as a string with the shot number followed by the ical option, separated by a comma, e.g. shot=’145419,1’
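Two hedged examples based on the parameters above: a server-side resampled EFIT01 signal and a PTDATA fetch (the shot number 123456 and the TDI pointname ‘WMHD’ are illustrative placeholders):
>> wmhd = OMFITmdsValue('DIII-D', treename='EFIT01', shot=123456, TDI='WMHD',
>>                      t_start=0, t_end=5000, dt=50)   # resample keywords; DIII-D time units are ms
>> t, y = wmhd.dim_of(0), wmhd.data()
>> ip = OMFITmdsValue('DIII-D', treename='PTDATA', shot='145419,1', TDI='ip')   # PTDATA with ical=1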
This class provides a convenient interface to the OMFITmdsConnectionBaseClass.
Specifically, it allows specifying a series of commands (mdsvalue, write, write_dataset, tcl, test_connection)
without having to re-type the server for each command.
cachesDir – directory to use for off-line storage of data
If True it defaults to OMFIT[‘MainSettings’][‘SETUP’][‘cachesDir’]
If False, then off-line MDSplus caching is disabled
If a string, then that value is used
limit – limit number of elements for in-memory caching
NOTE: off-line caching can be achieved via:
>> # off-line caching controlled by OMFIT[‘MainSettings’][‘SETUP’][‘cachesDir’]
>> OMFIT[‘MainSettings’][‘SETUP’][‘cachesDir’] = ‘/path/to/where/MDS/cache/data/resides’
>> OMFITmdsCache(cachesDir=True)
>>
>> # off-line caching for this OMFIT session to specific folder
>> OMFITmdsCache(cachesDir=’/path/to/where/MDS/cache/data/resides’)
>>
>> # purge off-line caching (clears directory based on cachesDir)
>> OMFITmdsCache().purge()
>>
>> # disable off-line caching for this OMFIT session
>> OMFITmdsCache(cachesDir=False)
>>
>> # disable default off-line caching
>> OMFIT[‘MainSettings’][‘SETUP’][‘cachesDir’]=False
Interprets a signal like abc * def by making multiple MDSplus calls.
MDSplus might be able to interpret expressions like abc*def by itself,
but sometimes this doesn’t work (such as when the data are really in PTDATA?)
You might also have already cached or want to cache abc and def locally
because you need them for other purposes.
Parameters:
server – string
The name of the MDSplus server to connect to
shot – int
Shot number to query
treename – string or None
Name of the MDSplus tree. None is for connecting to PTDATA at DIII-D.
tdi – string
A pointname or expression containing pointnames
Use ‘testytest’ as a fake pointname to avoid connecting to MDSplus during testing
scratch – dict-like [optional]
Catch intermediate quantities for debugging by providing a dict
Returns:
(array, array, string/None, string/None)
x: independent variable
y: dependent variable as a function of x
units: units of y or None if not found
xunits: units of x or None if not found
Attempts to lookup a list of available EFITs from MDSplus
Works for devices that store EFITs together in a group under a parent tree, such as:
EFIT (parent tree)
|- EFIT01 (results from an EFIT run)
|- EFIT02 (results from another run)
|- EFIT03
|- …
If the device’s MDSplus tree is not arranged like this, it will fail and return [].
Requires a single MDSplus call
Parameters:
scratch_area – dict
Scratch area for storing results to reduce repeat calls. Mainly included to match
call signature of available_efits_from_rdb(), since OMFITmdsValue already has caching.
device – str
Device name
shot – int
Shot number
list_empty_efits – bool
List all EFITs, including those without any data
default_snap_list – dict [optional]
Default set of EFIT treenames. Newly discovered ones will be added to the list.
format – str
Instructions for formatting data to make the EFIT tag name.
Provided for compatibility with available_efits_from_rdb() because the only option is ‘{tree}’.
Returns:
(dict, str)
Dictionary keys will be descriptions of the EFITs
Dictionary values will be the formatted identifiers.
For now, the only supported format is just the treename.
If lookup fails, the dictionary will be {‘’: ‘’} or will only contain default results, if any.
String will contain information about the discovered EFITs
VMS dates are ticks since 1858-11-17 00:00:00, where each tick is 100 ns
Unix ticks are seconds since 1970-01-01 00:00:00 GMT or 1969-12-31 16:00:00
This function may be useful because MDSplus dates, at least for KSTAR, are
recorded as VMS timestamps, which are not understood by datetime.
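A small self-contained sketch of the conversion described above (the tick value is a placeholder):
>> from datetime import datetime, timedelta
>> vms_ticks = 44_292_000_000_000_000                                      # placeholder VMS timestamp (100 ns ticks)
>> stamp = datetime(1858, 11, 17) + timedelta(microseconds=vms_ticks / 10.0)
>> unix_seconds = (stamp - datetime(1970, 1, 1)).total_seconds()           # seconds since the Unix epoch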
OMFIT class used to load from Multi Mode Model output files
:param filename: filename passed to OMFITascii class
:param **kw: keyword dictionary passed to OMFITascii class
OMFIT class used to interface with FORTRAN namelist files
Parameters:
filename – filename to be parsed
input_string – input string to be parsed (takes precedence over filename)
nospaceIsComment – whether a line which starts without a space should be retained as a comment. If None, a “smart” guess is attempted
outsideOfNamelistIsComment – whether the content outside of the namelist blocks should be retained as comments. If None, a “smart” guess is attempted
retain_comments – whether comments should be retained or discarded
skip_to_symbol – string to jump to for the parsing. Content before this string is ignored
collect_arrays – whether arrays defined throughout the namelist should be collected into single entries
(e.g. a=5,a(1,4)=0)
multiDepth – whether nested namelists are allowed
bang_comment_symbol – string containing the characters that should be interpreted as comment delimiters.
equals – how the equal sign should be written when saving the namelist
compress_arrays – compress repeated elements in an array by using v*n namelist syntax
max_array_chars – wrap long array lines
explicit_arrays – (True,False,1) whether to place name(1) in front of arrays.
If 1 then (1) is only placed in front of arrays that have only one value.
separator_arrays – characters to use between array elements
split_arrays – write each array element explicitly on a separate line
Specifically this functionality was introduced to split TRANSP arrays
idlInput – whether to interpret the namelist as IDL code
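A minimal sketch of round-tripping a namelist file (the filename and the group/variable names are placeholders):
>> nml = OMFITnamelist('input.nml')          # placeholder filename
>> nml['INDATA']['NSHOT'] = 123456           # namelist groups behave like nested dictionaries
>> nml.save()                                # writes back using the formatting options above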
OMFIT class used to interface with FORTRAN namelist files with arrays indexed according to FORTRAN indexing convention
Parameters:
filename – filename to be parsed
input_string – input string to be parsed (takes precedence over filename)
nospaceIsComment – whether a line which starts without a space should be retained as a comment. If None, a “smart” guess is attempted
outsideOfNamelistIsComment – whether the content outside of the namelist blocks should be retained as comments. If None, a “smart” guess is attempted
retain_comments – whether comments should be retained or discarded
skip_to_symbol – string to jump to for the parsing. Content before this string is ignored
collect_arrays – whether arrays defined throughout the namelist should be collected into single entries
(e.g. a=5,a(1,4)=0)
multiDepth – whether nested namelists are allowed
bang_comment_symbol – string containing the characters that should be interpreted as comment delimiters.
equals – how the equal sign should be written when saving the namelist
compress_arrays – compress repeated elements in an array by using v*n namelist syntax
max_array_chars – wrap long array lines
explicit_arrays – (True,False,1) whether to place name(1) in front of arrays.
If 1 then (1) is only placed in front of arrays that have only one value.
separator_arrays – characters to use between array elements
split_arrays – write each array element explicitly on a separate line
Specifically this functionality was introduced to split TRANSP arrays
idlInput – whether to interpret the namelist as IDL code
OMFIT class used to interface with NETCDF files
This class is based on the netCDF4 library which supports the following file formats:
‘NETCDF4’, ‘NETCDF4_CLASSIC’, ‘NETCDF3_CLASSIC’, ‘NETCDF3_64BIT’
NOTE: This class contains OMFITncData class objects.
Parameters:
filename – filename passed to OMFITobject class
**kw – keyword dictionary passed to OMFITobject class and the netCDF4.Dataset() method at loading
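A short sketch of reading a NetCDF file (the filename and variable name are placeholders):
>> nc = OMFITnc('results.nc')        # placeholder filename
>> print(nc['Te']['data'])           # 'Te' is a placeholder variable; its values live under ['data']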
Class which takes care of converting netCDF variables into SortedDict to be used in OMFIT
OMFITncData object are intended to be contained in OMFITnc objects
Parameters:
variable –
if None then returns
if a netCDF4.Variable object then data is read from that variable
if anything else, this is used as the data
if a string and filename is set then this is the name of the variable to read from the file
dimension –
Not used if variable is a netCDF4.Variable object or None
If None, then the dimension of the variable is automatically set
If not None sets the dimension of the variable
Ignored if filename is set
dtype –
Not used if variable is a netCDF4.Variable object or None
If None, then the data type of the variable is automatically set
If not None sets the data type of the variable
Ignored if filename is set
filename –
this is the filename from which variables will be read
Same class as OMFITncData but this type of object
will not be saved into the NetCDF file. This is useful
if one wants to create “shadow” NetCDF variables into OMFIT
without altering the original NetCDF file.
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as it is this method can detect if .filename was changed and if so, makes a copy from the original .filename (saved in the .link attribute) to the new .filename
Function to load the json mapping files (local or remote)
Allows for merging external mapping rules defined by users.
This function sanity-checks the mapping file and adds extra info required for mapping
Parameters:
machine – machine for which to load the mapping files
branch – GitHub branch from which to load the machine mapping information
user_machine_mappings – Dictionary of mappings that users can pass to this function to temporarily use their mappings
(useful for development and testing purposes)
return_raw_mappings – Return mappings without following __include__ statements or resolving eval2TDI directives
raise_errors – raise errors or simply print warnings if something isn’t right
Dynamically evaluate whether time is homogeneous or not
NOTE: this method does not read ods[‘ids_properties.homogeneous_time’]; instead it uses the time info to figure it out
Parameters:
default – what to return in case no time basis is defined
Returns:
True/False or default value (True) if no time basis is defined
Method to access data stored in ODS with no processing of the key, and it is thus faster than the ODS.__getitem__(key)
Effectively behaves like a pure Python dictionary/list __getitem__.
This method is mostly meant to be used in the inner workings of the ODS class.
NOTE: ODS.__getitem__(key, False) can be used to access items in the ODS with disabled cocos and coordinates processing but with support for different syntaxes to access data
Method to assign data to an ODS with no processing of the key, and it is thus faster than the ODS.__setitem__(key, value)
Effectively behaves like a pure Python dictionary/list __setitem__.
This method is mostly meant to be used in the inner workings of the ODS class.
dynamic – whether dynamic loaded key should be shown.
This is True by default because this should be the case for calls that are facing the user.
Within the inner workings of OMAS we thus need to be careful and keep track of when this should not be the case.
Throughout the library we use dynamic=1 or dynamic=0 for debug purposes, since one can place a conditional
breakpoint in this function checking if dynamic is True and self.dynamic to verify that indeed the dynamic=True
calls come from the user and not from within the library itself.
n – raise an error if a number of occurrences different from n is found
regular_expression_startswith – indicates that use of regular expressions
in the search_pattern is preceded by certain characters.
This is used internally by some methods of the ODS to force users
to use ‘@’ to indicate access to a path by regular expression.
Returns:
list of ODS locations matching search_pattern pattern
Returns data of an ODS location and corresponding coordinates as an xarray dataset
Note that the Dataset and the DataArrays have their attributes set with the ODSs structure info
Return xarray.Dataset representation of a whole ODS
Forming the N-D labeled arrays (tensors) that are at the base of xarrays,
requires that the number of elements in the arrays do not change across
the arrays of data structures.
Parameters:
homogeneous –
False: flat representation of the ODS
(data is not collected across arrays of structures)
’time’: collect arrays of structures only along the time dimension
(always valid for homogeneous_time=True)
’full’: collect arrays of structures along all dimensions
(may be valid in many situations, especially related to
simulation data with homogeneous_time=True and where
for example number of ions, sources, etc. do not vary)
None: smart setting, uses homogeneous=’time’ if homogeneous_time=True else False
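For example, assuming homogeneous time, the whole ODS can be turned into an xarray.Dataset as described above:
>> ds = ods.dataset(homogeneous='time')   # collect arrays of structures along the time dimension
>> print(ds.data_vars)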
filename – filename.XXX where the extension is used to select save format method (eg. ‘pkl’,’nc’,’h5’,’ds’,’json’,’ids’)
set to imas, s3, hdc, mongo for load methods that do not have a filename with extension
*args – extra arguments passed to save_omas_XXX() method
**kw – extra keywords passed to save_omas_XXX() method
filename – filename.XXX where the extension is used to select load format method (eg. ‘pkl’,’nc’,’h5’,’ds’,’json’,’ids’)
set to imas, s3, hdc, mongo for save methods that do not have a filename with extension
consistency_check – perform consistency check once the data is loaded
*args – extra arguments passed to load_omas_XXX() method
**kw – extra keywords passed to load_omas_XXX() method
Dynamically load OMAS data for seekable storage formats
Parameters:
filename – filename.XXX where the extension is used to select load format method (eg. ‘nc’,’h5’,’ds’,’json’,’ids’)
set to imas, s3, hdc, mongo for save methods that do not have a filename with extension
consistency_check – perform consistency check once the data is loaded
*args – extra arguments passed to dynamic_omas_XXX() method
**kw – extra keywords passed to dynamic_omas_XXX() method
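A hedged sketch of the save/load round trip described above (format inferred from the extension; the filename is a placeholder):
>> from omas import ODS
>> ods.save('pulse_12345.nc')                            # NetCDF chosen from the '.nc' extension
>> ods2 = ODS()
>> ods2.load('pulse_12345.nc', consistency_check=True)   # reload and re-validate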
This function sets currents in ods[‘core_profiles’][‘profiles_1d’][time_index]
If provided currents are inconsistent with each other or ods, ods is not updated and an error is thrown.
Updates integrated currents in ods[‘core_profiles’][‘global_quantities’]
(N.B.: equilibrium IDS is required for evaluating j_tor and integrated currents)
Parameters:
ods – ODS to update in-place
time_index – ODS time index to be updated
if None, all times are updated
Parameters:
rho_tor_norm – normalized rho grid upon which each j is given
For each j:
ndarray: set in ods if consistent
‘default’: use value in ods if present, else set to None
None: try to calculate from currents; delete from ods if you can’t
Parameters:
j_actuator – Non-inductive, non-bootstrap current <J.B>/B0
N.B.: used for calculating other currents and consistency, but not set in ods
j_bootstrap – Bootstrap component of <J.B>/B0
j_ohmic – Ohmic component of <J.B>/B0
j_non_inductive – Non-inductive component of <J.B>/B0
Consistency requires j_non_inductive = j_actuator + j_bootstrap, either
as explicitly provided or as computed from other components.
j_total – Total <J.B>/B0
Consistency requires j_total = j_ohmic + j_non_inductive either as
explicitly provided or as computed from other components.
This function sets the currents in ods[‘core_profiles’][‘profiles_1d’][time_index]
using ods[‘equilibrium’][‘time_slice’][time_index][‘profiles_1d’][‘j_tor’]
This function derives values of empty fields in profiles_2d from other parameters in the equilibrium ods
Currently only the magnetic field components are supported
averages – dictionary with average times for individual constraints
Smoothed using Gaussian, sigma=averages/4. and the convolution is integrated across +/-4.*sigma.
cutoff_hz – a list of two elements with low and high cutoff frequencies [lowFreq, highFreq]
rm_integr_drift_after – time in ms after which it is assumed that all currents are zero and the signal should be equal to zero. Used for removing the integrator drift
This routine creates interpolators for quantities and stores them in the cache for future use.
It can also be used to just return the current profile_2d quantity by omitting dim1 and dim2.
At the moment this routine always extrapolates for data outside the defined grid range.
ax – axes instance into which to plot (default: gca())
reset_fan_color – bool
At the start of each bolometer fan (group of channels), set color to None to let a new one be picked by the
cycler. This will override manually specified color.
colors – list of matplotlib color specifications. Do not use a single RGBA style spec.
**kw –
Additional keywords for bolometer plot
Accepts standard omas_plot overlay keywords listed in overlay() documentation: mask, labelevery, …
Remaining keywords are passed to plot call for drawing lines for the bolometer sightlines
ax – axes instance into which to plot (default: gca())
angle_not_in_pipe_name – bool
Set this to include (Angle) at the end of injector labels. Useful if injector/pipe names don’t already
include angles in them.
which_gas –
string or list
Filter for selecting which gas pipes to display.
If string: get a preset group, like ‘all’.
If list: only pipes in the list will be shown. Abbreviations are tolerated; e.g. GASA is recognized as
GASA_300. One abbreviation can turn on several pipes. There are several injection location names
starting with RF_ on DIII-D, for example.
show_all_pipes_in_group – bool
Some pipes have the same R,Z coordinates of their exit positions (but different phi locations) and will
appear at the same location on the plot. If this keyword is True, labels for all the pipes in such a group
will be displayed together. If it is False, only the first one in the group will be labeled.
simple_labels – bool
Simplify labels by removing suffix after the last underscore.
label_spacer – int
Number of blank lines and spaces to insert between labels and symbol
colors – list of matplotlib color specifications.
These colors control the display of various gas ports. The list will be repeated to make sure it is long enough.
Do not specify a single RGB tuple by itself. However, a single tuple inside list is okay [(0.9, 0, 0, 0.9)].
If the color keyword is used (See **kw), then color will be popped to set the default for colors in case colors
is None.
draw_arrow – bool or dict
Draw an arrow toward the machine at the location of the gas inlet. If dict, pass keywords to arrow drawing func.
**kw –
Additional keywords for gas plot:
Accepts standard omas_plot overlay keywords listed in overlay() documentation: mask, labelevery, …
Remaining keywords are passed to plot call for drawing markers at the gas locations.
ods – ODS instance
Must contain langmuir_probes with embedded position data
ax – Axes instance
embedded_probes – list of strings
Specify probe names to use. Only the embedded probes listed will be plotted. Set to None to plot all probes.
Probe names are like ‘F11’ or ‘P-6’ (the same as appear on the overlay).
colors – list of matplotlib color specifications. Do not use a single RGBA style spec.
show_embedded – bool
Recommended: don’t enable both embedded and reciprocating plots at the same time; make two calls instead.
It will be easier to handle mapping of masks, colors, etc.
show_reciprocating – bool
**kw –
Additional keywords.
Accepts standard omas_plot overlay keywords listed in overlay() documentation: mask, labelevery, …
Others will be passed to the plot() call for drawing the probes.
Plots overlays of hardware/diagnostic locations on a tokamak cross section plot
Parameters:
ods – OMAS ODS instance
ax – axes instance into which to plot (default: gca())
allow_autoscale – bool
Certain overlays will be allowed to unlock xlim and ylim, assuming that they have been locked by equilibrium_CX.
If this option is disabled, then hardware systems like PF-coils will be off the plot and mostly invisible.
debug_all_plots – bool
Individual hardware systems are on by default instead of off by default.
return_overlay_list – Return list of possible overlays that could be plotted
**kw –
additional keywords for selecting plots.
Select plots by setting their names to True; e.g.: if you want the gas_injection plot, set gas_injection=True
as a keyword.
If debug_all_plots is True, then you can turn off individual plots by, for example, setting gas_injection=False.
Instead of True to simply turn on an overlay, you can pass a dict of keywords to pass to a particular overlay
method, as in thomson={‘labelevery’: 5}. After an overlay pops off its keywords, remaining keywords are passed
to plot, so you can set linestyle, color, etc.
Overlay functions accept these standard keywords:
mask: bool array
Set of flags for switching plot elements on/off. Must be equal to the number of channels or items to be
plotted.
labelevery: int
Sets how often to add labels to the plot. A setting of 0 disables labels, 1 labels every element,
2 labels every other element, 3 labels every third element, etc.
notesize: matplotlib font size specification
Applies to annotations drawn on the plot. Examples: ‘xx-small’, ‘medium’, 16
label_ha: None or string or list of (None or string) instances
Descriptions of how labels should be aligned horizontally. Either provide a single specification or a
list of specs matching or exceeding the number of labels expected.
Each spec should be: ‘right’, ‘left’, or ‘center’. None (either as a scalar or an item in the list) will
give default alignment for the affected item(s).
label_va: None or string or list of (None or string) instances
Descriptions of how labels should be aligned vertically. Either provide a single specification or a
list of specs matching or exceeding the number of labels expected.
Each spec should be: ‘top’, ‘bottom’, ‘center’, ‘baseline’, or ‘center_baseline’.
None (either as a scalar or an item in the list) will give default alignment for the affected item(s).
label_r_shift: float or float array/list.
Add an offset to the R coordinates of all text labels for the current hardware system.
(in data units, which would normally be m)
Scalar: add the same offset to all labels.
Iterable: Each label can have its own offset.
If the list/array of offsets is too short, it will be padded with 0s.
label_z_shift: float or float array/list
Add an offset to the Z coordinates of all text labels for the current hardware system
(in data units, which would normally be m)
Scalar: add the same offset to all labels.
Iterable: Each label can have its own offset.
If the list/array of offsets is too short, it will be padded with 0s.
Additional keywords are passed to the function that does the drawing; usually matplotlib.axes.Axes.plot().
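A hedged example call based on the keyword conventions above (assuming the overlay entry point is exposed as ods.plot_overlay, with an equilibrium cross-section already drawn):
>> ods.plot_overlay(thomson={'labelevery': 5, 'notesize': 'x-small'},   # dict of overlay keywords
>>                  gas_injection=True)                                 # simple on/off selection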
Add sample equilibrium data
This method operates in-place.
Parameters:
ods – ODS instance
time_index – int
Under which time index should fake equilibrium data be loaded?
include_profiles – bool
Include 1D profiles of pressure, q, p’, FF’
They are in the sample set, so not including them means deleting them.
include_phi – bool
Include 1D and 2D profiles of phi (toroidal flux, for calculating rho)
This is in the sample set, so not including it means deleting it.
include_psi – bool
Include 1D and 2D profiles of psi (poloidal flux)
This is in the sample set, so not including it means deleting it.
include_wall – bool
Include the first wall
This is in the sample set, so not including it means deleting it.
include_q – bool
Include safety factor
This is in the sample set, so not including it means deleting it.
include_xpoint – bool
Include X-point R-Z coordinates
This is not in the sample set, so including it means making it up
dynamic – whether dynamic loaded key should be shown.
This is True by default because this should be the case for calls that are facing the user.
Within the inner workings of OMAS we thus need to be careful and keep track of when this should not be the case.
Throughout the library we use dynamic=1 or dynamic=0 for debug purposes, since one can place a conditional
breakpoint in this function checking if dynamic is True and self.dynamic to verify that indeed the dynamic=True
calls come from the user and not from within the library itself.
filename – filename.XXX where the extension is used to select save format method (eg. ‘pkl’,’nc’,’h5’,’ds’,’json’,’ids’)
set to imas, s3, hdc, mongo for load methods that do not have a filename with extension
*args – extra arguments passed to save_omas_XXX() method
**kw – extra keywords passed to save_omas_XXX() method
filename – filename.XXX where the extension is used to select load format method (eg. ‘pkl’,’nc’,’h5’,’ds’,’json’,’ids’)
set to imas, s3, hdc, mongo for save methods that do not have a filename with extension
consistency_check – perform consistency check once the data is loaded
*args – extra arguments passed to load_omas_XXX() method
**kw – extra keywords passed to load_omas_XXX() method
Method to access data to CodeParameters with no processing of the key.
Effectively behaves like a pure Python dictionary/list __getitem__.
This method is mostly meant to be used in the inner workings of the CodeParameters class.
Method to assign data to CodeParameters with no processing of the key.
Effectively behaves like a pure Python dictionary/list __setitem__.
This method is mostly meant to be used in the inner workings of the CodeParameters class.
Checks if two ODSs have any difference and returns a string with the cause of the difference
Parameters:
ods1 – first ods to check
ods2 – second ods to check
ignore_type – ignore object type differences
ignore_empty – ignore empty nodes
ignore_keys – ignore the following keys
ignore_default_keys – ignores the following keys from the comparison
dataset_description.data_entry.user
dataset_description.data_entry.run
dataset_description.data_entry.machine
dataset_description.ids_properties
dataset_description.imas_version
dataset_description.time
ids_properties.homogeneous_time
ids_properties.occurrence
ids_properties.version_put.data_dictionary
ids_properties.version_put.access_layer
ids_properties.version_put.access_layer_language
rtol : The relative tolerance parameter
atol : The absolute tolerance parameter
Returns:
string with reason for difference, or False otherwise
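A sketch, assuming this comparison is exposed as a function taking the two ODSs (the name different_ods is an assumption):
>> reason = different_ods(ods1, ods2, ignore_empty=True)
>> if reason:
>>     print(reason)          # string describing why the two ODSs differ
>> else:
>>     print('ODSs match')    # False means no differences were found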
objects_encode – how to handle non-standard JSON objects
* True: encode numpy arrays, complex, and uncertain
* None: numpy arrays as lists, encode complex, and uncertain
* False: numpy arrays as lists, fail on complex, and uncertain
Function to get machines that have their mappings defined
This function takes care of remotely transferring the needed files (both .json and .py) if a remote branch is requested
Parameters:
machine – string with machine name or None
branch – GitHub branch from which to load the machine mapping information
Returns:
if machine==None returns dictionary with list of machines and their json mapping files
if machine is a string, then returns json mapping filename
Transform r,z,a,l arrays commonly used to describe poloidal magnetic
probe geometry to actual r,z coordinates of the end-points of the probes.
This is useful for plotting purposes.
Parameters:
r0 – r coordinates [m]
z0 – Z coordinates [m]
a0 – poloidal angles [radians]
l0 – length [m]
cocos – cocos convention
Returns:
list of 2-points r and z coordinates of individual probes
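A plausible sketch of the transformation (treating each probe as a segment of length l0 centred at (r0, z0) and tilted by a0; this convention is an assumption, for illustration only):
>> import numpy as np
>> r0, z0, a0, l0 = np.atleast_1d(r0, z0, a0, l0)
>> r_ends = np.array([r0 - 0.5 * l0 * np.cos(a0), r0 + 0.5 * l0 * np.cos(a0)]).T   # two R values per probe
>> z_ends = np.array([z0 - 0.5 * l0 * np.sin(a0), z0 + 0.5 * l0 * np.sin(a0)]).T   # two Z values per probe
>> segments = list(zip(r_ends.tolist(), z_ends.tolist()))                          # one (r-pair, z-pair) per probe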
Given <Jt/R> returns <J.B>, or vice versa
Transformation obeys <J.B> = (1/f)*(<B^2>/<1/R^2>)*(<Jt/R> + dp/dpsi*(1 - f^2*<1/R^2>/<B^2>))
N.B. input current must be in the same COCOS as equilibrium.cocosio
Parameters:
rho – normalized rho grid for input JtoR or JparB
JtoR – input <Jt/R> profile (cannot be set along with JparB)
JparB – input <J.B> profile (cannot be set along with JtoR)
equilibrium – equilibrium.time_slice[:] ODS containing quantities needed for transformation
includes_bootstrap – set to True if input current includes bootstrap
Returns:
<Jt/R> if JparB set or <J.B> if JtoR set
Example: given total <Jt/R> on rho grid with an existing ods, return <J.B>
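Following the example statement above, a hedged sketch (assuming the function is available as transform_current and that the ods already holds an equilibrium time slice):
>> JparB = transform_current(rho, JtoR=JtoR,
>>                           equilibrium=ods['equilibrium.time_slice.0'],
>>                           includes_bootstrap=True)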
search for the index in an array structure that matches some conditions
Parameters:
ods – ODS location that is an array of structures
conditions – dictionary (or ODS) with entries that must match and their values
* condition[‘name’]=value : check value
* condition[‘name’]=True : check existence
* condition[‘name’]=False : check non-existence
NOTE: True/False as flags for (non-)existence is not an issue since IMAS does not support booleans
no_matches_return – what index to return if no matches are found
no_matches_raise_error – whether to raise an error if no matches are found
multiple_matches_raise_error – whether to raise an error if multiple matches are found
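A hedged sketch (the function name search_in_array_structure is an assumption; the conditions follow the convention above):
>> i = search_in_array_structure(ods['core_profiles.profiles_1d.0.ion'],
>>                               {'label': 'D'},          # value match on the ion label
>>                               no_matches_return=None)  # None instead of an index if nothing matches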
Returns normalizing scale for a physical quantity.
E.g. “temperature” returns 1.e-3 and keV
:param phys_qaunt: str with a physical quantity. Uses IMAS scheme names where possible
:return: scale, unit
omfit_classes.omfit_omas.identify_cocos(B0, Ip, q, psi, clockwise_phi=None, a=None)
Utility function to identify COCOS coordinate system
If multiple COCOS are possible, then all are returned.
Parameters:
B0 – toroidal magnetic field (with sign)
Ip – plasma current (with sign)
q – safety factor profile (with sign) as function of psi
psi – poloidal flux profile (with sign)
clockwise_phi – (optional) [True, False] if phi angle is defined clockwise or not
This is required to identify odd vs even COCOS
Note that this cannot be determined from the output of a code.
An easy way to determine this is to answer the question: is positive B0 clockwise?
a – (optional) flux surfaces minor radius as function of psi
This is required to identify 2*pi term in psi definition
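For example (the field, current, and profile arrays below are placeholders):
>> import numpy as np
>> psi = np.linspace(0.0, 1.5, 11)                      # placeholder poloidal flux profile (with sign)
>> q = np.linspace(3.0, 5.0, 11)                        # placeholder safety factor profile (with sign)
>> print(identify_cocos(B0=-2.0, Ip=1.0e6, q=q, psi=psi, clockwise_phi=False))   # one or more possible COCOS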
Provides environment for data input/output to/from OMAS
Parameters:
ods – ODS on which to operate
cocosio – COCOS convention
coordsio – dictionary/ODS with coordinates for data interpolation
unitsio – True/False whether data read from OMAS should have units
uncertainio – True/False whether data read from OMAS should have uncertainties
input_data_process_functions – list of functions that are used to process data that is passed to the ODS
xmlcodeparams – view code.parameters as an XML string while in this environment
dynamic_path_creation – whether to dynamically create the path when setting an item
* False: raise an error when trying to access a structure element that does not exist
* True (default): arrays of structures can be incrementally extended by accessing at the next element in the array
* ‘dynamic_array_structures’: arrays of structures can be dynamically extended
kw – extra keywords set attributes of the ods (eg. ‘consistency_check’, ‘dynamic_path_creation’, ‘imas_version’)
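A typical use is as a context manager, for example to read data in a specific COCOS and interpolated onto a caller-defined grid (the psi grid below is a placeholder):
>> import numpy as np
>> from omas import omas_environment
>> psi_grid = np.linspace(0.0, 1.0, 21)   # placeholder coordinate grid for interpolation
>> with omas_environment(ods, cocosio=11,
>>                       coordsio={'equilibrium.time_slice.0.profiles_1d.psi': psi_grid}):
>>     pressure = ods['equilibrium.time_slice.0.profiles_1d.pressure']   # returned in COCOS 11 on psi_grid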
conditions – dictionary with conditions for returning a match. For example:
{‘List of IDSs’:[‘equilibrium’,’core_profiles’,’core_sources’,’summary’], ‘Workflow’:’CORSICA’, ‘Fuelling’:’D-T’}
squash – remove attributes that are equal among all entries
Returns:
OMFITiterscenario dictionary only with matching entries
filename – filename.XXX where the extension is used to select load format method (eg. ‘pkl’,’nc’,’h5’,’ds’,’json’,’ids’)
set to imas, s3, hdc, mongo for save methods that do not have a filename with extension
consistency_check – perform consistency check once the data is loaded
*args – extra arguments passed to load_omas_XXX() method
**kw – extra keywords passed to load_omas_XXX() method
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as it is this method can detect if .filename was changed and if so, makes a copy from the original .filename (saved in the .link attribute) to the new .filename
Return xarray.Dataset representation of a whole ODS
Forming the N-D labeled arrays (tensors) that are at the base of xarrays,
requires that the number of elements in the arrays do not change across
the arrays of data structures.
Parameters:
homogeneous –
False: flat representation of the ODS
(data is not collected across arrays of structures)
’time’: collect arrays of structures only along the time dimension
(always valid for homogeneous_time=True)
’full’: collect arrays of structures along all dimensions
(may be valid in many situations, especially related to
simulation data with homogeneous_time=True and where
for example number of ions, sources, etc. do not vary)
None: smart setting, uses homogeneous=’time’ if homogeneous_time=True else False
Method to access data stored in ODS with no processing of the key, and it is thus faster than the ODS.__getitem__(key)
Effectively behaves like a pure Python dictionary/list __getitem__.
This method is mostly meant to be used in the inner workings of the ODS class.
NOTE: ODS.__getitem__(key, False) can be used to access items in the ODS with disabled cocos and coordinates processing but with support for different syntaxes to access data
Dynamically evaluate whether time is homogeneous or not
NOTE: this method does not read ods[‘ids_properties.homogeneous_time’]; instead it uses the time info to figure it out
Parameters:
default – what to return in case no time basis is defined
Returns:
True/False or default value (True) if no time basis is defined
dynamic – whether dynamic loaded key should be shown.
This is True by default because this should be the case for calls that are facing the user.
Within the inner workings of OMAS we thus need to be careful and keep track of when this should not be the case.
Throughout the library we use dynamic=1 or dynamic=0 for debug purposes, since one can place a conditional
breakpoint in this function checking if dynamic is True and self.dynamic to verify that indeed the dynamic=True
calls come from the user and not from within the library itself.
Dynamically load OMAS data for seekable storage formats
Parameters:
filename – filename.XXX where the extension is used to select load format method (eg. ‘nc’,’h5’,’ds’,’json’,’ids’)
set to imas, s3, hdc, mongo for save methods that do not have a filename with extension
consistency_check – perform consistency check once the data is loaded
*args – extra arguments passed to dynamic_omas_XXX() method
**kw – extra keywords passed to dynamic_omas_XXX() method
This function sets currents in ods[‘core_profiles’][‘profiles_1d’][time_index]
If provided currents are inconsistent with each other or ods, ods is not updated and an error is thrown.
Updates integrated currents in ods[‘core_profiles’][‘global_quantities’]
(N.B.: equilibrium IDS is required for evaluating j_tor and integrated currents)
Parameters:
ods – ODS to update in-place
time_index – ODS time index to be updated
if None, all times are updated
Parameters:
rho_tor_norm – normalized rho grid upon which each j is given
For each j:
ndarray: set in ods if consistent
‘default’: use value in ods if present, else set to None
None: try to calculate from currents; delete from ods if you can’t
Parameters:
j_actuator – Non-inductive, non-bootstrap current <J.B>/B0
N.B.: used for calculating other currents and consistency, but not set in ods
j_bootstrap – Bootstrap component of <J.B>/B0
j_ohmic – Ohmic component of <J.B>/B0
j_non_inductive – Non-inductive component of <J.B>/B0
Consistency requires j_non_inductive = j_actuator + j_bootstrap, either
as explicitly provided or as computed from other components.
j_total – Total <J.B>/B0
Consistency requires j_total = j_ohmic + j_non_inductive either as
explicitly provided or as computed from other components.
This function sets the currents in ods[‘core_profiles’][‘profiles_1d’][time_index]
using ods[‘equilibrium’][‘time_slice’][time_index][‘profiles_1d’][‘j_tor’]
This function derives values of empty fields in profiles_2d from other parameters in the equilibrium ods
Currently only the magnetic field components are supported
averages – dictionary with average times for individual constraints
Smoothed using Gaussian, sigma=averages/4. and the convolution is integrated across +/-4.*sigma.
cutoff_hz – a list of two elements with low and high cutoff frequencies [lowFreq, highFreq]
rm_integr_drift_after – time in ms after which it is assumed that all currents are zero and the signal should be equal to zero. Used for removing the integrator drift
This routine creates interpolators for quantities and stores them in the cache for future use.
It can also be used to just return the current profile_2d quantity by omitting dim1 and dim2.
At the moment this routine always extrapolates for data outside the defined grid range.
ax – axes instance into which to plot (default: gca())
reset_fan_color – bool
At the start of each bolometer fan (group of channels), set color to None to let a new one be picked by the
cycler. This will override manually specified color.
colors – list of matplotlib color specifications. Do not use a single RGBA style spec.
**kw –
Additional keywords for bolometer plot
Accepts standard omas_plot overlay keywords listed in overlay() documentation: mask, labelevery, …
Remaining keywords are passed to plot call for drawing lines for the bolometer sightlines
ax – axes instance into which to plot (default: gca())
angle_not_in_pipe_name – bool
Set this to include (Angle) at the end of injector labels. Useful if injector/pipe names don’t already
include angles in them.
which_gas –
string or list
Filter for selecting which gas pipes to display.
If string: get a preset group, like ‘all’.
If list: only pipes in the list will be shown. Abbreviations are tolerated; e.g. GASA is recognized as
GASA_300. One abbreviation can turn on several pipes. There are several injection location names
starting with RF_ on DIII-D, for example.
show_all_pipes_in_group – bool
Some pipes have the same R,Z coordinates of their exit positions (but different phi locations) and will
appear at the same location on the plot. If this keyword is True, labels for all the pipes in such a group
will be displayed together. If it is False, only the first one in the group will be labeled.
simple_labels – bool
Simplify labels by removing suffix after the last underscore.
label_spacer – int
Number of blank lines and spaces to insert between labels and symbol
colors – list of matplotlib color specifications.
These colors control the display of various gas ports. The list will be repeated to make sure it is long enough.
Do not specify a single RGB tuple by itself. However, a single tuple inside list is okay [(0.9, 0, 0, 0.9)].
If the color keyword is used (See **kw), then color will be popped to set the default for colors in case colors
is None.
draw_arrow – bool or dict
Draw an arrow toward the machine at the location of the gas inlet. If dict, pass keywords to arrow drawing func.
**kw –
Additional keywords for gas plot:
Accepts standard omas_plot overlay keywords listed in overlay() documentation: mask, labelevery, …
Remaining keywords are passed to plot call for drawing markers at the gas locations.
ods – ODS instance
Must contain langmuir_probes with embedded position data
ax – Axes instance
embedded_probes – list of strings
Specify probe names to use. Only the embedded probes listed will be plotted. Set to None to plot all probes.
Probe names are like ‘F11’ or ‘P-6’ (the same as appear on the overlay).
colors – list of matplotlib color specifications. Do not use a single RGBA style spec.
show_embedded – bool
Recommended: don’t enable both embedded and reciprocating plots at the same time; make two calls instead.
It will be easier to handle mapping of masks, colors, etc.
show_reciprocating – bool
**kw –
Additional keywords.
Accepts standard omas_plot overlay keywords listed in overlay() documentation: mask, labelevery, …
Others will be passed to the plot() call for drawing the probes.
Plots overlays of hardware/diagnostic locations on a tokamak cross section plot
Parameters:
ods – OMAS ODS instance
ax – axes instance into which to plot (default: gca())
allow_autoscale – bool
Certain overlays will be allowed to unlock xlim and ylim, assuming that they have been locked by equilibrium_CX.
If this option is disabled, then hardware systems like PF-coils will be off the plot and mostly invisible.
debug_all_plots – bool
Individual hardware systems are on by default instead of off by default.
return_overlay_list – Return list of possible overlays that could be plotted
**kw –
additional keywords for selecting plots.
Select plots by setting their names to True; e.g.: if you want the gas_injection plot, set gas_injection=True
as a keyword.
If debug_all_plots is True, then you can turn off individual plots by, for example, setting gas_injection=False.
Instead of True to simply turn on an overlay, you can pass a dict of keywords to pass to a particular overlay
method, as in thomson={‘labelevery’: 5}. After an overlay pops off its keywords, remaining keywords are passed
to plot, so you can set linestyle, color, etc.
Overlay functions accept these standard keywords:
mask: bool array
Set of flags for switching plot elements on/off. Must be equal to the number of channels or items to be
plotted.
labelevery: int
Sets how often to add labels to the plot. A setting of 0 disables labels, 1 labels every element,
2 labels every other element, 3 labels every third element, etc.
notesize: matplotlib font size specification
Applies to annotations drawn on the plot. Examples: ‘xx-small’, ‘medium’, 16
label_ha: None or string or list of (None or string) instances
Descriptions of how labels should be aligned horizontally. Either provide a single specification or a
list of specs matching or exceeding the number of labels expected.
Each spec should be: ‘right’, ‘left’, or ‘center’. None (either as a scalar or an item in the list) will
give default alignment for the affected item(s).
label_va: None or string or list of (None or string) instances
Descriptions of how labels should be aligned vertically. Either provide a single specification or a
list of specs matching or exceeding the number of labels expected.
Each spec should be: ‘top’, ‘bottom’, ‘center’, ‘baseline’, or ‘center_baseline’.
None (either as a scalar or an item in the list) will give default alignment for the affected item(s).
label_r_shift: float or float array/list.
Add an offset to the R coordinates of all text labels for the current hardware system.
(in data units, which would normally be m)
Scalar: add the same offset to all labels.
Iterable: Each label can have its own offset.
If the list/array of offsets is too short, it will be padded with 0s.
label_z_shift: float or float array/list
Add an offset to the Z coordinates of all text labels for the current hardware system
(in data units, which would normally be m)
Scalar: add the same offset to all labels.
Iterable: Each label can have its own offset.
If the list/array of offsets is too short, it will be padded with 0s.
Additional keywords are passed to the function that does the drawing; usually matplotlib.axes.Axes.plot().
Add sample equilibrium data
This method operates in-place.
Parameters:
ods – ODS instance
time_index – int
Under which time index should fake equilibrium data be loaded?
include_profiles – bool
Include 1D profiles of pressure, q, p’, FF’
They are in the sample set, so not including them means deleting them.
include_phi – bool
Include 1D and 2D profiles of phi (toroidal flux, for calculating rho)
This is in the sample set, so not including it means deleting it.
include_psi – bool
Include 1D and 2D profiles of psi (poloidal flux)
This is in the sample set, so not including it means deleting it.
include_wall – bool
Include the first wall
This is in the sample set, so not including it means deleting it.
include_q – bool
Include safety factor
This is in the sample set, so not including it means deleting it.
include_xpoint – bool
Include X-point R-Z coordinates
This is not in the sample set, so including it means making it up
n – raise an error if a number of occurrences different from n is found
regular_expression_startswith – indicates that use of regular expressions
in the search_pattern is preceded by certain characters.
This is used internally by some methods of the ODS to force users
to use ‘@’ to indicate access to a path by regular expression.
Returns:
list of ODS locations matching search_pattern pattern
Method to assign data to an ODS with no processing of the key, and it is thus faster than the ODS.__setitem__(key, value)
Effectively behaves like a pure Python dictionary/list __setitem__.
This method is mostly meant to be used in the inner workings of the ODS class.
Returns data of an ODS location and corresponding coordinates as an xarray dataset
Note that the Dataset and the DataArrays have their attributes set with the ODSs structure info
R and Z are from the tips of the arrows in puff_loc.pro; phi from angle listed in labels in puff_loc.pro .
I recorded the directions of the arrows on the EFITviewer overlay, but I don’t know how to include them in IMAS, so
I commented them out.
Warning: changes to gas injector configuration with time are not yet included. This is just the best picture I could
make of the 2018 configuration.
Data sources:
EFITVIEWER: iris:/fusion/usc/src/idl/efitview/diagnoses/DIII-D/puff_loc.pro accessed 2018 June 05, revised 20090317
DIII-D webpage: https://diii-d.gat.com/diii-d/Gas_Schematic accessed 2018 June 05
DIII-D webpage: https://diii-d.gat.com/diii-d/Gas_PuffLocations accessed 2018 June 05
Updated 2018 June 05 by David Eldon
Returns:
dict
Information or instructions for follow up in central hardware description setup
Serves LP functions by identifying active probes (those that have actual data saved) for a given shot
Sorry, I couldn’t figure out how to do this with a server-side loop over all
the probes, so we have to loop MDS calls on the client side. At least I
resampled to speed that part up.
This could be a lot faster if I could figure out how to make the GETNCI
commands work on records of array signals.
Parameters:
shot – int
allowed_probes – int array
Restrict the search to a certain range of probe numbers to speed things up
These are the numbers of storage trees in MDSplus, not the physical probe numbers
Downloads LP probe data from MDSplus and loads them to the ODS
Parameters:
ods – ODS instance
shot – int
probes – int array-like [optional]
Integer array of DIII-D probe numbers.
If not provided, find_active_d3d_probes() will be used.
allowed_probes – int array-like [optional]
Passed to find_active_d3d_probes(), if applicable.
Improves speed by limiting search to a specific range of probe numbers.
tstart – float
Time to start resample (s)
tend – float
Time to end resample (s)
Set to <= tstart to disable resample
Server-side resampling does not work when time does not increase
monotonically, which is a typical problem for DIII-D data. Resampling is
not recommended for DIII-D.
dt – float
Resample interval (s)
Set to 0 to disable resample
Server-side resampling does not work when time does not increase
monotonically, which is a typical problem for DIII-D data. Resampling is
not recommended for DIII-D.
overwrite – bool
Download and write data even if they already are present in the ODS.
quantities – list of strings [optional]
List of quantities to gather. None to gather all available. Options are:
ion_saturation_current, heat_flux_parallel, n_e, t_e, surface_area_effective,
v_floating, and b_field_angle
Returns:
ODS instance
The data are added in-place, so catching the return is probably unnecessary.
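A hedged sketch using the probe-search helper named above (the shot number and probe range are placeholders; the loader function name is not spelled out here, so only the search step is shown):
>> import numpy as np
>> active = find_active_d3d_probes(176235, allowed_probes=np.arange(1, 64))   # placeholder shot and range
>> print(active)   # storage-tree numbers of probes that actually have data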
Serves LP functions by identifying active probes (those that have actual data saved) for a given shot
Sorry, I couldn’t figure out how to do this with a server-side loop over all
the probes, so we have to loop MDS calls on the client side. At least I
resampled to speed that part up.
This could be a lot faster if I could figure out how to make the GETNCI
commands work on records of array signals.
Parameters:
shot – int
allowed_probes – int array
Restrict the search to a certain range of probe numbers to speed things up
Downloads LP probe data from MDSplus and loads them to the ODS
Parameters:
ods – ODS instance
shot – int
probes – int array-like [optional]
Integer array of KSTAR probe numbers.
If not provided, find_active_kstar_probes() will be used.
allowed_probes – int array-like [optional]
Passed to find_active_kstar_probes(), if applicable.
Improves speed by limiting search to a specific range of probe numbers.
tstart – float
Time to start resample (s)
tend – float
Time to end resample (s)
Set to <= tstart to disable resample
dt – float
Resample interval (s)
Set to 0 to disable resample
overwrite – bool
Download and write data even if they already are present in the ODS.
quantities – list of strings [optional]
List of quantities to gather. None to gather all available. Options are:
ion_saturation_current
Since KSTAR has only one option at the moment, this keyword is ignored,
but is accepted to provide a consistent call signature compared to similar
functions for other devices.
Returns:
ODS instance
The data are added in-place, so catching the return is probably unnecessary.
Sets up an OMAS ODS so that it can power a cross section view with diagnostic/hardware overlays. This involves
looking up and writing locations of various hardware to the ODS.
Parameters:
device – string
Which tokamak?
pulse – int
Which pulse number? Used for gathering equilibrium and for looking up hardware configuration as it
could vary with time.
time – int or float array [optional]
Time (ms) within the pulse, used for looking up equilibrium only
If None and no gEQDSKs, try to get all times from MDSplus
efitid – string
EFIT SNAP file or ID tag in MDS plus: used for looking up equilibrium only
geqdsk – OMFITgeqdsk instance or dict-like containing OMFITgeqdsk instance(s) (optional)
Provides EFIT instead of lookup using device, pulse, time, efitid. efitid will be ignored completely. device
and pulse will still be used to look up hardware configuration. time might be used. Providing inconsistent
data may produce confusing plots.
aeqdsk – OMFITaeqdsk instance or dict-like containing OMFITaeqdsk instance(s) (optional)
Provides an option to load aeqdsk data to OMAS. Requirements:
- geqdsk(s) are being used as the source for basic data
- aeqdsk shot and all aeqdsk times match geqdsk shot and times exactly
- OMFITaeqdsk has a to_omas() method (not implemented yet as of 2021-11-12)
meqdsk – OMFITmeqdsk instance or dict-like containing OMFITmeqdsk instance(s) (optional)
Provides an option to load meqdsk data to OMAS. Requirements:
- geqdsk(s) are being used as the source for basic data
- meqdsk shot and all meqdsk times match geqdsk shot and times exactly
keqdsk – OMFITkeqdsk instance or dict-like containing OMFITkeqdsk instance(s) (optional)
Provides an option to load keqdsk data to OMAS. Requirements:
- geqdsk(s) are being used as the source for basic data
- keqdsk shot and all keqdsk times match geqdsk shot and times exactly
overwrite – bool
Flag indicating whether it is okay to overwrite locations if they already exist in ODS
default_load – bool
Default action to take for loading a system. For example, **kw lets you explicitly set gas_injection=False to
prevent calling setup_gas_injection_hardware_description. But for systems which aren’t specified, the default
action (True/False to load/not load) is controlled by this parameter.
minimal_eq_data – bool
Skip loading all the equilibrium data needed to recreate GEQDSK files and only get what’s needed for plots.
no_empty – bool
Filter out equilibrium time-slices that have 0 current or 0 boundary outline points.
(this is passed to multi_efit_to_omas())
**kw – keywords dictionary
Disable gathering/setup of data for individual hardware systems by setting them to False using their names
within IMAS.
For example: gas_injection=False will prevent the call to setup_gas_injection_hardware_description().
Returns:
OMAS ODS instance containing the data you need to draw a cross section w/ diagnostic/hardware overlays.
ods – ODS instance
A New ODS will be created if this is None
minimal – bool
Only gather and add enough data to run a cross-section plot
aeqdsk_time_diff_tol – float
Time difference in ms to allow between GEQDSK and AEQDSK time bases, in case they don’t match exactly.
GEQDSK slices where the closest AEQDSK time are farther away than this will not have AEQDSK data.
no_empty – bool
Remove empty GEQDSK slices from the result.
Sometimes EFITs are loaded with a few invalid/empty slices (0 current,
no boundary, all psi points = 0, etc.). This option will discard those slices
before loading results into an ODS.
**kw – Additional keywords to read_basic_eq_from_mds()
But not the quantities to gather as those are set explicitly in this function.
Returns:
ODS instance
The edit is done in-place, so you don’t have to catch the output if you supply the input ODS
On fail: empty ODS is returned
Transfers poloidal field coil geometry data from a standard format used by efitviewer to ODS.
WARNING: only rudimentary identifiers are assigned.
You should assign your own identifiers and only rely on this function to assign numerical geometry data.
Parameters:
ods – ODS instance
Data will be added in-place
coil_data – 2d array
coil_data[i] corresponds to coil i. The columns are R (m), Z (m), dR (m), dZ (m), tilt1 (deg), and tilt2 (deg)
This should work if you just copy data from iris:/fusion/usc/src/idl/efitview/diagnoses/<device>/coils.dat
(the filenames for the coils vary)
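As a hedged illustration of the coil_data layout described above (the transfer function itself is not named here, so transfer_pf_coil_geometry below is a placeholder):
>> import numpy as np
>> from omas import ODS
>> # columns: R (m), Z (m), dR (m), dZ (m), tilt1 (deg), tilt2 (deg); one row per coil
>> coil_data = np.loadtxt('coils.dat')
>> ods = ODS()
>> transfer_pf_coil_geometry(ods, coil_data)  # placeholder name for the routine documented above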
Generates an ODS with some minimal test data, including a sample equilibrium and at least one hardware system
:param device: string
:param shot: int
:param efitid: string
:return: ODS
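A hedged usage sketch (make_minimal_test_ods is only a placeholder, since the generator's actual name is not shown above):
>> ods = make_minimal_test_ods(device='DIII-D', shot=123456, efitid='EFIT01')  # placeholder name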
Gathers shape control points describing targets for the plasma boundary to intersect and returns them
Parameters:
dev – string
sh – int
times – float array [optional]
Times to use. If provided, interpolate data. If not, auto-assign using first valid segment.
debug – bool
debug_out – dict-like [optional]
Returns:
tuple
(
t: 1d float array (nt): times in ms (actual times used)
r: 2d float array (nt * nseg): R coordinate in m of shape control points. Unused points are filled with NaN.
z: 2d float array (nt * nseg): Z coordinate in m of shape control points
rx: 2d float array (nt * 2): R coordinate in m of X points (2nd one may be NaN)
rz: 2d float array (nt * 2): Z of X-points. Order is [bottom, top]
rptnames: list of strings (nseg) giving labels for describing ctrl pts
zptnames: list of strings (nseg) giving labels for describing ctrl pts
list of 1D bool arrays vs. t giving validity of outer bottom, inner bottom, outer top, inner top strike pts
)
Identify a specific strike point and get its coordinates
It’s easy to just pick a strike point, or even find inner-upper or outer-lower, but
identifying primary-outer or primary-inner is a little harder to do. It’s not clear
that there’s any consistent ordering of strike points that would trivialize this
process, so this function exists to do it for you.
Parameters:
ods – ODS instance
in_out – str
‘outer’: try to get an outer strike point (default)
‘inner’: try to get an inner strike point
pri_sec – str
‘primary’: try to get a strike point connected to the primary X-point (default)
‘secondary’: try to get a strike point connected with a secondary X-point
Calculates the orthogonal distance from some point(s) to the LCFS
Works by stepping along the steepest gradient until the LCFS is reached.
Parameters:
ods – ODS instance
A complete, top level ODS, or selection of time slices from equilibrium.
That is, ods[‘equilibrium’][‘time_slice’], if ods is a top level ODS.
r0 – 1D float array
R coordinates of the point(s) in meters.
If grid=False, the arrays must have length matching the number of time slices in equilibrium.
z0 – 1D float array
Z coordinates of the point(s) in meters.
Length must match r0
grid – bool
Return coordinates on a time-space grid, assuming each point in R-Z is static and given a separate history.
Otherwise (grid=False), assume one point is given & it moves in time, so return array will be 1D vs. time.
time_range – 2 element numeric iterable [optional]
Time range in seconds to use for filtering time_slice in ODS
zf – float
Zoom factor for upsampling the equilibrium first to improve accuracy. Ignored unless > 1.
maxstep – float
Maximum step size in m
Restraining step size should prevent flying off the true path
minstep – float
Minimum step size in m
Prevent calculation from taking too long by forcing a minimum step size
maxsteps – int
Maximum number of steps allowed in path tracing. Protection against getting stuck in a loop.
debug – bool
Returns a dictionary with internal quantities instead of an array with the final answer.
Returns:
float array
Length of a path orthogonal to flux surfaces from (r0, z0) to LCFS, in meters.
If grid=True: 2D float array (time by points)
If grid=False: a 1D float array vs. time
Loads a sample of equilibrium shape specifications under pulse_schedule
Parameters:
ods – ODS instance
Data go here
hrts_gate – str
One of the boundary points is for checking whether the boundary passes through the HRTS range.
But DIII-D’s HRTS can be reconfigured to handle three different ranges! So you can select
‘top’: Relevant when the HRTS channels are positioned high (this might be the default position)
‘mid’
‘low’
‘any’: goes from the bottom of the ‘low’ range to the top of the ‘top’ range.
Returns r, z points along a high resolution contour at some psi_N value
Requires the skimage package, which is not a required OMFIT dependency and may not be available.
This method is preferred when skimage is available!
Parameters:
slice_index – int
Index of the time slice of the equilibrium
psin – float
psi_N of the desired contour
zoom – float
zoom / upscaling factor
Returns:
list
Each element is a section of the contour, which might not be connected (e.g. different curve for PFR)
Each element is a 2D array, where [:, 0] is R and [:, 1] is Z
Returns r, z points along a high resolution contour at some psi_N value
Uses OMFIT util functions. 10-50% slower than skimage, but doesn’t require external dependencies.
skimage is better if you have it!
Parameters:
slice_index – int
Index of the time slice of the equilibrium
psin – float
psi_N of the desired contour
zoom – float
zoom / upscaling factor
Returns:
list
Each element is a section of the contour, which might not be connected (e.g. different curve for PFR)
Each element is a 2D array, where [:, 0] is R and [:, 1] is Z
Finds contour segments of a flux surface outboard of the midplane by a specific amount
Parameters:
slice_index – int
midplane_dr – float
Returns:
list of 2D arrays
Sections of the contour for the flux surface
The contour may have some disconnected regions in general, but if it is all simply connected,
there will be only one element in the list with a single 2D array.
Grades conformity of X-point position(s) to specifications
Parameters:
improve_xpoint_measurement – bool
simple_map – bool
For each target, just find the closest measured X-point and compare.
Requires at least as many measurements as targets and raises OmasUtilsBadInput if not satisfied.
psin_tolerance_primary – float
Tolerance in psin for declaring an X-point to be on the primary separatrix
Returns:
list of dicts
One list element per time-slice.
Within each dictionary, keys give labels for X-point targets, and values give the corresponding conformity grades.
Grades proximity of boundary to target boundary points for a specific time-slice of the equilibrium
Strike points are included and treated like other boundary points
(except they get a note appended to their name when forming labels).
There can be differences in the exact wall measurements (such as attempts to
account for toroidal variation) that cause a strike point measurement or
specification to not lie on the reported wall. Since there is no obvious way to
enforce a consistent interpretation of the wall location between the strike
point specification and the measured equilibrium, this function simply checks
that the boundary passes through the specified strike point. This could fail if
the contour segment passing through the specified strike point were disconnected
from the core plasma (for example, if the plasma were limited), but it is
considered adequate for now.
Parameters:
slice_index – int
Number of the time-slice to consider for obtaining the LCFS.
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as is, this method detects whether .filename was changed and, if so, copies the file from the original .filename (saved in the .link attribute) to the new .filename
Class for handling the netcdf statefile from ONETWO,
with streamlining for plotting and summing heating terms
Parameters:
filename – The location on disk of the statefile
verbose – Turn on printing of debugging messages for this object
persistent – Upon loading, this class converts some variables from the
psi grid to the rho grid, but it only saves these variables back to the
statefile if persistent is True, which is slower
Dictionary having non-zero power flow terms, including the total;
keys of the dictionary end in e or i to indicate electron or ion
heating; units are MW
Dictionary having non-zero heating terms, including the total;
keys of the dictionary end in e or i to indicate electron or ion
heating; units are MW/m^3
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as is, this method detects whether .filename was changed and, if so, copies the file from the original .filename (saved in the .link attribute) to the new .filename
Return the parts of NAMELIS2 that are needed by ONETWO or FREYA to describe the beams
Parameters:
ods – An ODS object that contains beam information in the nbi IDS
t – (ms) The time at which the beam parameters should be determined
device – The device this is being set up for
nml2 – A namelist object pointing to inone[‘NAMELIS2’], which will be modified in place
smooth_power – A tuple of smooth_power, smooth_power_time
This ONETWO_beam_params_from_ods function returns beam power for a single time slice. Smoothing the beam powers accounts for the time-integrated effect of any instantaneous changes in beam power. If smooth_power is not passed in, the beams are still smoothed. If calling this function multiple times for different times, t, for the same shot, it makes sense to smooth the powers outside of this function. For backward compatibility, if only smooth_power is given, then smooth_power_time is assumed to be btime (ods[‘nbi.time’]).
time_avg – (ms) If smooth_power is not given, then the beam power is causally smoothed using time_avg for the
window_size of smooth_by_convolution. time_avg is also used to determine how far back the beams should be reported as
being on
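A hedged call sketch using the parameter names listed above (the ods and inone objects are assumed to exist already; the values are illustrative):
>> nml2 = inone['NAMELIS2']  # namelist object that will be modified in place
>> ONETWO_beam_params_from_ods(ods, t=3000.0, device='DIII-D', nml2=nml2, time_avg=50.0)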
Class accesses Osborne fits stored in MDSplus, and provides convenient methods
for accessing the data or fitting functions used in those fits, including
handling the covariance of the fitting parameters
Parameters:
server – The device (really MDSplus archive)
treename – The tree where the Osborne-tool profiles are stored
shot – The shot number
time – The timeid of the profile
runid – The runid of the profile
Note that this class assumes that the profiles are stored as
[tree][‘PROFDB_PED’][‘P<time>_<runid>’]
Remap the disparate psinorm grids for each variable onto the same psinorm grid
Parameters:
npoints – number of points onto which to remap the original 256-point grid
- If npoints is an int: make an evenly spaced array.
- If npoints is an array: use it as the grid
- If npoints is a string: use the ‘psinorm’ from that item in the pfile (by default ne[‘psinorm’])
**kw – additional keywords are passed to scipy.interpolate.interp1d
return: The entire object remapped to the same psinorm grid
Several types of files are recognized:
- Type-F: F-coil patch panel in ASCII archival format.
- Type-P: F-coil patch panel in binary format. Can be converted to type-F by
changing .patch_type attribute and executing .add_c_coil() method. May be obsolete.
- Type-I: I&C coil patch panel in ASCII format.
Parameters:
filename – string [optional if shot is provided]
Filename of original source file, including path. This will be preserved as
self.source_file, even if the class updates to a temporary file copy.
If shot is provided, filename controls output only; no source file is read.
In this case, filename need not include path.
shot – int [optional if filename is provided]
Shot number to use to look up patch data. Must provide patch_type when using shot.
If a filename is provided as well, shot will be used for lookup and filename will control output only.
patch_type – None or string
None lets the class auto-assign, which should be fine.
You can force it explicitly if you really want to.
debug_topic – string
Topic keyword to pass to printd. Allows all printd from the class to be consistent.
auto_clean – bool
Run cleanup method after parsing. This will remove some problems,
but prevent exact recovery of problematic original contents.
fpconvert – bool
Automatically convert type P files into type F so they can be saved.
server – string [optional if running in OMFIT framework]
Complete access instruction for server that runs viwhed, like “eldond@iris.gat.com:22”
tunnel – string [optional if running in OMFIT framework]
Complete access instruction for tunnel used to reach server, like “eldond@cybele.gat.com:2039”.
Use empty string ‘’ if no tunnel is needed.
work_dir – string [optional if running in OMFIT framework]
Local working directory (temporary/scratch) for temporary files related to remote executable calls
remote_dir – string [optional if running in OMFIT framework]
Remote working directory (temporary/scratch) for temporary files related to remote executable calls
default_patch_type – string
Patch panel type to assign if auto detection fails, such as if you’re initializing a blank patch panel file.
If you are reading a valid patch panel file, auto detection will probably work because it is very good.
Please choose from ‘F’ or ‘I’. ‘P’ is a valid patch_type, but it’s read-only so it’s a very bad choice for
initializing a blank file to fill in yourself.
Adds an entry for C-coils to type-F patch panels, which is useful if converted from type-P.
Type-P files don’t have a C-coil entry.
:param d_supply: bool
True: Add the default D supply on C-coil setup.
False: Add default C-coil setup with no F-coil supplies on the C-coils (C-coil entry in F is blank)
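A hedged sketch of the type-P to type-F conversion mentioned above (patch is assumed to be a patch panel object loaded from a type-P file):
>> patch.patch_type = 'F'           # re-label the binary type-P data as ASCII type-F
>> patch.add_c_coil(d_supply=True)  # add the C-coil entry that type-P files lack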
__setitem__ notifies the parent object of new assignments so that index/name pairs in values can be kept self-consistent.
Without this intervention, a list’s element can be changed without changing the list itself, so the parent
OMFITpatchObject instance gets no notification, and lists of chopper indices and chopper names can get out of sync.
Returns list of MDSplus model_tree_quantities for all species.
Parameters:
warn – [bool] If True, the function will warn if some of the model_tree_quantities are missing in
OMFIT-source/omfit/omfit_classes/omfit_profiles.py and the model tree should be updated
no_update – [bool] If True, the function will return only items that are in the object AND on the model
tree, and ignore items that are not in model_tree_quantities.
Checks that basic/standard attrs are present. If not, they will be filled with standby values (usually ‘unknown’)
Also checks that ints are ints and not int64, which would prevent json from working properly.
Parameters:
quiet – If set to True, the function will not print warnings. By default set to False.
This script writes the OMFITprofiles dataset to DIII-D MDSplus and updates d3drdb accordingly
Parameters:
server – MDSplus server
shot – shot to store the data to
treename – MDSplus treename
skip_vars – variables to skip uploading. Array-like
relaxed – if set to True, then the function will only try to upload vars in the model_tree_quantities
list as recorded at the beginning of this file. If False, then this function will attempt to upload all
variables stored in self, and fail if a profile variable cannot be uploaded (usually due to there not being a
corresponding node on the MDSplus tree).
commit – bool
If set to False, the SQL query will not commit the data to the coderunrdb. This must be False for a
Jenkins test; otherwise, attempting to write data to the SQL database twice will throw an error.
iri_upload_metadata – optionally, a dictionary with metadata for upload to iri_upload_log table in
the code run RDB. Certain metadata are determined dynamically. If None, then it will not be logged
to iri_upload_metadata.
Converts OMFITprofiles dataset keys (strings) to MDSplus node names less than 12 characters long
Parameters:
inv – string to which to apply the transformation
if None the transformation is applied to all of the OMFITprofiles.model_tree_quantities for sanity check
reverse – reverse the translation. Used to translate MDSplus node names back to OMFITprofiles names
Returns:
transformed string or, if inv is None, the mapped_model_2_mds and mapped_mds_2_model dictionaries
eq – ODS() or dict (an OMFITtree() is a dict). Needs to contain equilibrium information, either in the form
of an ODS with the needed equilibria already loaded, or as OMFITgeqdsk() objects in the dict with the time [ms] as
keys. Times for the eq need to be exact matches to the profiles times coord.
times – array-like. Times for which you would like p-files to be generated.
shot – int. Shot number, only relevant for generating p-file names.
Returns:
OMFITtree() containing a series of OMFITpfile objs.
The Coulomb logarithm: the ratio of the maximum impact parameter to the
classical distance of closest approach in Coulomb scattering.
Lambda, the argument, is known as the plasma parameter.
Formula: ln Lambda = 17.3 -
Collisionality from J.D. Huba, “NRL FORMULARY”, 2011.
:param s: string. Species.
:param s2: string. Colliding species (default electrons). Currently not used
:return: Dataset
Calculate gyrofrequency at the LFS midplane.
:param s: string. Species.
:param mag_field: external structure generated from OMFITlib_general.mag_field_components
:param relativistic: bool. Make a relativistic correction for m_e (not actually the relativistic mass).
:return : Dataset.
Calculate X-mode R and L cutoffs.
Note, O-mode cutoff is already stored as omega_plasma_e.
:param s: string. Species.
:param relativistic: bool. Make a relativistic correction for m_e (not actually the relativistic mass).
:return : Dataset.
Radial electric field.
Formula:
:param s: Species which will be used to calculate Er
:param mag_field: external structure generated from OMFITlib_general.mag_field_components
Poloidal velocity from Kim, Diamond, Groebner, Phys. Fluids B (1991)
with poloidal in-out correction based on Ashourvan (2017)
:return: Places poloidal velocity for main-ion and impurity on outboard midplane into
DERIVED. Places neoclassical approximation for main-ion toroidal flow based on measured
impurity flow in self.
Pass-through implementation of Dataset.reset_coords(). Given names of coordinates, convert them to variables.
Unlike Dataset.reset_coords(), however, this function modifies in place!
param names: Names of coords to reset. Cannot be index coords. Defaults to all non-index coords.
param drop: If True, drop coords instead of converting. Default False.
Class for dynamic calculation of derived quantities
Examples:
Initialize the class with a filename and FIT Dataset.
>> tmp=OMFITprofiles(‘test.nc’, fits=root[‘OUTPUTS’][‘FIT’], equilibrium=root[‘OUTPUTS’][‘SLICE’][‘EQ’], root[‘SETTINGS’][‘EXPERIMENT’][‘gas’])
Accessing a quantity will dynamically calculate it.
>> print(tmp[‘Zeff’])
Quantities are then stored (they are not calculated twice).
>> tmp=OMFITprofiles(‘test.nc’,
Temperature of the main ion species.
Assumes it is equal to the measured ion species temperature.
If there are multiple impurity temperatures measured, it uses the first one.
This method allows execution of the script without invoking Tkinter commands
Note that the Tkinter commands will also be discarded for the OMFITpython scripts that this method calls
This method estimates how many prun processes will fit into the memory of the current system.
It returns one core less than possible, as a safety margin, since processes that do not have
enough memory will completely freeze the session.
nprocs – number of simultaneous processes; if None, then check SLURM_TASKS_PER_NODE, then OMP_NUM_THREADS, and finally use 4; the actual number of processes will always be checked against self.estimate_nprocs_from_available_memory and the smaller value will be used
resultNames – name, or list of names with the variables that will be returned at the end of the execution of each script
noGUI – Disable GUI with gray/blue/green boxes showing progress of parallel run
prerun – string that is executed before each parallel execution (useful to set entries in the OMFIT tree)
postrun – string that is executed after each parallel execution (useful to gather entries from the OMFIT tree)
result_type – class of the object that will contain the prun results (e.g. OMFITtree, OMFITcollection, OMFITmcTree)
runIDs – list of strings used to name and retrieve the runs (use numbers by default)
runlabels – list of strings used to display the runs (same as runIDs by default)
no_mpl_pledge – User pledges not to call any matplotlib plotting commands as part of their scripts.
The prun will thus not need to switch the matplotlib backend, which would close any open figures.
**kw – additional keywords will appear as local variables in the user script
Local variables that are meant to change between different calls to the script should be passed as lists of length nsteps
Returns:
Dictionary containing the results from each script execution
Execute an OMFITpythonTask inside scipy.optimize.root
Optimizes actuators to achieve targets
Any tree location in the reset list is reset on each call to self
Tree will be in the state after the final run of self, whether the
optimizer converges or not. If there is an exception, tree is reset
using the reset list.
*See regression/test_optrun.py for example usage
Parameters:
actuators – dictionary of actuator dictionaries with the following keys
‘set’: function or string tree location to set as actuator
‘init’: initial value for actuator
targets – dictionary of target dictionaries with the following keys
‘get’: function or string tree location to get current value
‘target’: value to target
‘tol’: (optional) absolute tolerance of target value
reset – list of tree locations to be reset on each iteration of optimization
prerun – string that is executed before each execution of self
*useful for setting entries in the OMFIT tree
postrun – string that is executed after each execution of self
*useful for gathering entries from the OMFIT tree
method – keyword passed to scipy.optimize.root
tol – keyword passed to scipy.optimize.root
options – keyword passed to scipy.optimize.root
postfail – string that is executed if execution of self throws an error
*useful for gathering entries from the OMFIT tree
reset_on_fail – reset the tree if an exception or keyboard interrupt occurs
**kw – additional keywords passed to self.run()
Returns:
OptimizeResult output of scipy.optimize.root,
Convergence history of actuators, targets, and errors
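A hedged sketch of the actuator/target dictionaries described above (the tree locations are illustrative and opt is a placeholder for the method name; see regression/test_optrun.py for real usage):
>> actuators = {'puff': {'set': "root['INPUTS']['gas_puff']", 'init': 1.0e21}}
>> targets = {'density': {'get': "root['OUTPUTS']['n_e_avg']", 'target': 5.0e19, 'tol': 1.0e18}}
>> result, history = root['SCRIPTS']['run_case'].opt(actuators=actuators, targets=targets, reset=["root['OUTPUTS']"])  # placeholder method name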
Python script for OMFIT plots.
Differently from the OMFITpythonTask class, the OMFITpythonPlot will not refresh the OMFIT GUIs
though the OMFIT tree GUI itself will still be updated.
Use .plot() method for overplotting (called by pressing <Shift-Return> in the OMFIT GUI)
Use .plotFigure() method for plotting in new figure (called by pressing <Return> in the OMFIT GUI)
When a single script should open more than one figure, it’s probably best to use objects
of the OMFITpythonTask class and manually handle overplotting and the opening of new figures.
To use an OMFITpythonTask object for plotting, it’s useful to call the .runNoGUI method
which prevents update of the GUIs that are open.
Execute the script and open a new figure only if no figure was already open.
Effectively, this will result in an overplot.
This method is called by pressing <Shift-Return> in the OMFIT GUI.
Function used to setup default variables in an OMFIT script (of type
OMFITpythonTask, OMFITpythonTest, OMFITpythonGUI, or OMFITpythonPlot)
Really the magic function that allows OMFIT scripts to act as a function
Parameters:
**kw – keyword parameter dictionary with default value
Returns:
dictionary with variables passed by the user
To be used as
dfv = defaultVars(var1_i_want_to_define=None, var2_i_want_to_define=None)
and then later in the script, one can use var1_i_want_to_define or var2_i_want_to_define
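For instance, a hedged sketch (shot and smooth are example variable names, not part of any API):
>> # at the top of the OMFIT script
>> dfv = defaultVars(shot=123456, smooth=True)
>> print(shot)  # 123456 unless the caller passed shot=...
>> # from another script, the same file can then be driven like a function (keywords go to defaultVars):
>> root['SCRIPTS']['my_script'].run(shot=100500, smooth=False)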
This is a convenience function which returns the string for the working directory of OMFIT modules (remote or local).
The returned directory string is compatible with parallel running of modules. The format used is:
[server_OMFIT_working_directory]/[projectID]/[mainsettings_runID]/[module_tree_location]-[module_runid]/[p_multiprocessing_folder]/
Parameters:
root – root of the module (or string)
server – remote server. If empty string or None, then the local working directory is returned.
This is a utility function (to be used by module developers) which can be used to execute the same command on all of the modules
Note that this script will overwrite the content of your OMFIT tree.
Parameters:
doThis – python script to execute.
In this script the following variables are defined
root contains the reference to the current module being processed,
moduleID the moduleID,
rootName the location of the module in the tree
moduleFile the module filename
deploy – save the modules back on their original location
skip – skip modules that are already in the tree
these_modules_only – list of modules ID to process (useful for development of doThis)
This function attempts to import mayavi, mayavi.mlab
while avoiding known institutional installation pitfalls
as well as known tk vs qt backend issues.
Parameters:
verbose – bool. Prints a warning message if mayavi can’t be imported
Returns:
obj. mayavi if it was successfully imported, None if not
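A hedged usage sketch (mayavi_importer is only a placeholder, since the helper's actual name is not shown above):
>> mlab_host = mayavi_importer(verbose=True)  # placeholder name
>> if mlab_host is None:
>>     print('mayavi unavailable; skipping 3D plot')
>> else:
>>     mlab_host.mlab.figure()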
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as is, this method detects whether .filename was changed and, if so, copies the file from the original .filename (saved in the .link attribute) to the new .filename
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as is, this method detects whether .filename was changed and, if so, copies the file from the original .filename (saved in the .link attribute) to the new .filename
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as is, this method detects whether .filename was changed and, if so, copies the file from the original .filename (saved in the .link attribute) to the new .filename
Pass an arbitrary custom procedure to the sql database
Parameters:
procedure – string
A string that represents the custom procedure to be called
commit – bool
If set to False it will not commit the data to the coderunrdb. This should be done when running a
Jenkins test; otherwise it may attempt to write data to the same shot/runid twice and throw an error.
Returns:
Dict
Output list of pyodbc rows returned by the custom query
where – string or dict
Which record or records should be deleted
NOTE that all records that satisfy this condition will be deleted!
A dict will be converted into a string of the form “key1=value1 and key2=value2 …”
data – dict
Keys are columns to update and values are values to put into those columns.
where –
dict or string
Which record or records should be updated.
NOTE that all records that satisfy this condition will be updated!
If it’s a dictionary, the columns/data condition will be concatenated with “ AND “, so that
{‘my_column’: 5.2, ‘another_col’: 7} becomes “my_column=5.2 AND another_col=7”.
A string will be used directly.
commit – bool
Commit update to SQL. Set to false for testing without editing the database.
overwrite –
bool or int
0/False: If any of the keys in data already have entries in the table, do not update anything
1/True: Update everything. Don’t even check.
2: If any of the keys in data already have entries in the table, don’t update those,
but DO write missing entries.
verbose – bool
Print SQL command being used.
Returns:
string
The SQL command that would be used.
If the update is aborted due to overwrite avoidance, the SQL command will be prefixed by “ABORT:”
Sets up an encrypted password for OMFIT to use with SQL databases on a specific server
:param server: string
The server this credential applies to
Parameters:
username – string
Username on server. If a password is specified, this defaults to os.environ[‘USER’].
If neither username nor password is specified, OMFIT will try to read both from a login file.
password – string
The password to be encrypted. Set to ‘’ to erase the existing password for username@server, if there is one.
If None, OMFIT will attempt to read it from a default login file, like .pgpass. This may or may not be the right
password for server.
guest – bool
Use guest login and save it. Each server has its own, with the default being guest // guest_pwd .
Decides which string in a list of strings (like a list of dictionary keys) best matches a search string. This is
provided so that subtle changes in the spelling, arrangement, or sanitization of headers in input.dat don’t break
the interface.
Parameters:
keys – List of keys which are potential matches to test_key
search_string – String to search for; it should be similar to an expected key name in keys.
block_id – A string with the block number, like ‘1’ or ‘3a’. If this is provided, the standard approximate
block name will be used and search_string is not needed.
case_sensitive – T/F: If False, strings will have output of .upper() compared instead of direct comparison.
Returns:
A string containing the key name which is the closest match to search_string
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as is, this method detects whether .filename was changed and, if so, copies the file from the original .filename (saved in the .link attribute) to the new .filename
Fixes the dimensions of b2.boundary.parameters arrays by setting up collect_arrays() instructions
(only if collect_arrays keyword is not already specified)
inputs: returns a list of file references for building an input deck. Includes current run and common_folder.
check_inputs: returns a list of missing files in the input deck
param filename:
string
param label:
string
param common_folder:
bool
Flag this run as a common folder. It is not a real run. Instead, it holds common files which are shared
between the run folders.
param baserun:
bool [deprecated]
Old name for common_folder. Used to support loading old projects. Do not use in new development.
param coupled:
bool
Order a coupled B2.5 + Eirene run instead of a B2.5 standalone run.
param debug:
bool
Activate debug mode
param key:
string [optional]
Key used in the OMFIT tree
param custom_required_files:
list of strings [optional]
Override the standard required_files list in a manner which will persist across updates to the class.
If this option is not used and the default required_files list in the class changes, class instances will
update to the new list when they are loaded from saved projects. If the customization is used, even to
assign the default list, the customized list will persist even if the default changes.
param custom_required_files_coupled:
list of strings [optional]
Similar to custom_required_files, but for additional files used in coupled B2.5 + Eirene runs
param custom_required_files_continue:
list of strings [optional]
Similar to custom_required_files, but for additional files used to continue the run after initialization
param custom_key_outputs:
list of strings [optional]
Similar to custom_required_files, but for key output files
param version:
string [optional]
Used to keep track of the SOLPS code version that should be used to run this case. Should be like ‘SOLPS5.0’
param kw:
dict
Additional keywords passed to super class or used to accept restored attributes.
Searches parent item in OMFIT tree to find sibling instances of
OMFITsolpsCase and identify them as main or common_folder. This won’t
necessarily work the first time or during init because this case
(and maybe others) won’t be in the tree yet.
Parameters:
quiet – bool
Suppress print statements, even debug
inputs – [Optional] list of references to input files
If this is None, self.inputs() will be used to obtain the list.
initial – bool
Check list vs. initialization requirements instead of continuation requirements.
If True, some intermediate files won’t be added to the list. If False, it is assumed that the run is being
continued and there will be more required files.
required_files – [Optional] list of strings
IF this is None, the default self.required_files or self.required_files+required_files_coupled will be used.
quiet – bool
Suppress print statements, even debug
Returns:
list of strings
List of missing files. If it is empty, then there are no missing files and the run is ready to go.
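A hedged sketch of how the inputs()/check_inputs() pair might be used (solps_run is assumed to be an OMFITsolpsCase instance already in the tree):
>> deck = solps_run.inputs()  # file references from this run plus the common_folder
>> missing = solps_run.check_inputs(inputs=deck, initial=True)
>> if missing:
>>     print('Cannot launch yet; missing files:', missing)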
Attempts to read SOLPS-ITER specific settings or fills in required values with assumptions
These are special settings that are used in setup checks or similar activities
:return: None
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as is, this method detects whether .filename was changed and, if so, copies the file from the original .filename (saved in the .link attribute) to the new .filename
Test case with some extra methods to help with OMFIT testing tasks
To use this class, make your own class that is a subclass of this one:
>> class TestMyStuff(OMFITtest):
>> notify_gh_status = True # Example of a test setting you can override
In the top of your file, override key test settings as needed
Test settings you can override by defining them at the top of your class:
param warning_level:
int
Instructions for turning some warnings into exceptions & ignoring others
-1: Make no changes to warnings
0: No exceptions from warnings
1: Ignores some math warnings related to NaNs & some which should be
benign. Exceptions for other warnings.
2: (RECOMMENDED) Ignores a small set of warnings which are probably
benign, but exceptions for everything else.
3: Exceptions for practically every warning. Only ignores some really
inconsequential ones from OMFIT.
4: No warnings allowed. Always throw exceptions!
The warnings are changed before your test starts, so you can still
override or change them in setUp() or __init__().
param count_figs:
bool
Enable counting of figures. Manage using collect_figs(n) after opening
n figures. The actual figure count will be compared to the expected
count (supplied by you as the argument), resulting in an AssertionError
if the count does not match.
param count_guis:
bool
Enable counting of GUIs. Manage using collect_guis(n) after opening
n GUIs. AssertionError if GUI count does not match expectation given
via argument.
param leave_figs_open:
bool
Don’t close figures at the end of each test (can lead to clutter)
param modules_to_load:
list of strings or tuples
Orders OMFIT to load the modules as indicated.
Strings: modules ID. Tuples: (module ID, key)
param report_table:
bool
Keep a table of test results to include in final report
param table_sorting_columns:
list of strings
Names of columns to use for sorting table. Passed to table’s group_by().
This is most useful if you are adding extra columns to the results table
during your test class’s __init__() and overriding tearDown() to
populate them.
param notify_gh_comment:
int
Turn on automatic report in a GitHub comment
0: off
1: always try to post or edit
2: try to post or edit on failure only
4: edit a comment instead of posting a new one if possible; only post if necessary
8: edit the top comments and append test report or replace existing report with same context
5 = behaviors of 4 and 1 (for example)
param notify_gh_status:
bool
Turn on automatic GitHub commit status updates
param gh_individual_status:
int
0 or False: No individual status contexts.
Maximum of one status report if notify_gh_status is set.
1 or True: Every test gets its own status context, including a pending
status report when the test starts.
2: Tests get their own status context only if they fail.
No pending status for individual tests.
3: Like 2 but ignores notify_gh_status and posts failed status reports
even if reports are otherwise disabled.
param gh_overall_status_after_individual:
bool
Post the overall status even if individual status reports are enabled
(set to 1 or True). Otherwise, the individual contexts replace the
overall context. The overall status will be posted if individual status
reports are set to 2 (only on failure).
param notify_email:
bool
Automatically send a test report to the user via email
param save_stats_to:
dict-like
Container for catching test statistics
param stats_key:
string [optional]
Test statistics are saved to save_stats_to[stats_key].
stats_key will be automatically generated using the subclass name and a
timestamp if left as None.
param topics_skipped:
list of strings
Provide a list of skipped test topics in order to have them included in
the test report. The logic for skipping probably happens in the setup of
your subclass, so we can’t easily generate this list here.
param omfitx:
reference to OMFITx
Provide a reference to OMFITx to enable GUI counting and closing.
This class might not be able to find OMFITx by itself, depending on how
it is loaded.
Create an instance of the class that will use the named test
method when executed. Raises a ValueError if the instance does
not have a method with the specified name.
Assert that some code raises an exception similar to the one provided.
The purpose is to bypass the way OMFIT’s .importCode() will provide a new reference to an exception
that differs from the exception received by a script that does from OMFITlib_whatever import Thing
Utility for running a set of OMFITtest-based test suites
Example usage:
>> class TestOmfitThingy(OMFITtest):
>> def test_thingy_init(self):
>> assert 1 != 0, ‘1 should not be 0’
>> manage_tests(TestOmfitThingy)
Parameters:
tests – OMFITtest instance or list of OMFITtest instances
Define tests to run
failfast – bool
Passed straight to unittest. Causes execution to stop at the first error
instead of running all tests and reporting which pass/fail.
separate – bool
Run each test suite separately and give it a separate context. Otherwise
they’ll have a single combined context.
combined_context_name – string [optional]
If not separate, override the automatic name
(‘+’.join([test.__class__.__name__ for test in tests]))
force_gh_status – bool [optional]
If None, use GitHub status post settings from the items in tests.
If True or False: force status updates on/off.
force_gh_comment – bool or int [optional]
Like force_gh_status, but for comments, and with extra options:
Set to 2 to post comments only on failure.
force_email – bool [optional]
None: notify_email on/off defined by test.
True or False: force email notifications to be on or off.
force_warning_level – int [optional]
None: warning_level defined by test.
int: Force the warning level for all tests to be this value.
print_report – bool
Print to console / command line?
there_can_be_only_one –
True, None or castable as int
This value is interpreted as a set of binary flags. So 6 should be interpreted as options 2 and 4 active.
A value of True is converted into 255 (all the bits are True, including unused bits).
A value of None is replaced by the default value, which is True or 255.
A float or string will work if it can be converted safely by int().
1: Any of the flags will activate this feature. The 1 bit has no special meaning beyond activation.
If active, old github comments will be deleted. The newest report may be retained.
2: Limit deletion to reports that match the combined context of the test being run.
4: Only protect the latest comment if it reports a failure; if the last test passed, all comments may be deleted
8: Limit scope to comments with matching username
raise_if_errors – bool
Raise an OMFITexception at the end if there were any errors
max_table_width – int
Width in columns for the results table, if applicable. If too small,
some columns will be hidden. Set to -1 to allow the table to be any
width.
set_gh_status_keywords – dict [optional]
Dictionary of keywords to pass to set_gh_status()
post_comment_to_github_keywords – dict [optional]
Dictionary of keywords to pass to post_comment_to_github(), like thread, org, repository, and token
ut_verbosity – int
Verbosity level for unittest. 1 is normal, 0 suppresses ., E, and F reports from unittest as it runs.
ut_stream – Output stream for unittest, such as StringIO() or sys.stdout
only_these – string or list of strings
Names of test units to run (with or without leading test_). Other test units will not be run. (None to run all tests)
kw – quietly ignores other keywords
Returns:
tuple containing:
list of unittest results
astropy.table.Table instance containing test results
string reporting test results, including a formatted version of the table
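A hedged sketch of catching the documented return tuple (TestOmfitThingy is the example class defined above):
>> results, results_table, report = manage_tests(TestOmfitThingy, failfast=False, only_these=['thingy_init'])
>> print(report)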
A context manager like catch_warnings, that copies and restores the warnings
filter upon exiting the context, with preset levels of warnings that turn some
warnings into exceptions.
Parameters:
record – specifies whether warnings should be captured by a
custom implementation of warnings.showwarning() and be appended to a list
returned by the context manager. Otherwise None is returned by the context
manager. The objects appended to the list are arguments whose attributes
mirror the arguments to showwarning().
module – to specify an alternative module to the module
named ‘warnings’ and imported under that name. This argument is only useful
when testing the warnings module itself.
level –
(int) Controls how many warnings should throw errors
-1: Do nothing at all and return immediately
0: No warnings are promoted to exceptions. Specific warnings defined in
higher levels are ignored and the rest appear as warnings, but with
‘always’ instead of ‘default’ behavior: they won’t disappear after
the first instance.
All higher warning levels turn all warnings into exceptions and then
selectively ignore some of them:
1: Ignores everything listed in level 2, but also ignores many common
math errors that produce NaN.
2: RECOMMENDED: In addition to level 3, also ignores several warnings
of low-importance, but still leaves many math warnings (divide by 0)
as errors.
3: Ignores warnings which are truly irrelevant to almost any normal
regression testing, such as the warning about not being able to make
backup copies of scripts that are loaded in developer mode. Should
be about as brutal as level 4 during the actual tests, but somewhat
more convenient while debugging afterward.
4: No warnings are ignored. This will be really annoying and not useful
for many OMFIT applications.
Specify whether to record warnings and if an alternative module
should be used other than sys.modules[‘warnings’].
For compatibility with Python 3.0, please consider all arguments to be keyword-only.
Removes old automatic test reports
:param lvl: int
Interpreted as a set of binary flags, so 7 means options 1, 2, and 4 are active.
1: Actually execute the deletion commands instead of just testing. In test mode, it returns list of dicts
containing information about comments that would be deleted.
2: Limit scope to current context (must supply contexts) and do not delete automatic comments from other context
4: Do not preserve the most recent report unless it describes a failure
8: Only target comments with matching username
Parameters:
keyword – string [optional]
The marker for deletion. Comments containing this string are gathered. The one with the latest timestamp
is removed from the list. The rest are deleted. Defaults to the standard string used to mark automatic comments.
contexts – string or list of strings [optional]
Context(s) for tests to consider. Relevant only when scope is limited to present context.
remove_all – bool
Special case: don’t exclude the latest comment from deletion because its status was already resolved. This comes
up when the test would’ve posted a comment and then immediately deleted it and just skips posting. In this case,
the actual last comment is not really the last comment that would’ve existed had we not skipped posting, so
don’t protect it.
**kw –
optional keywords passed to delete_matching_gh_comments:
thread: int [optional]
Thread#, like pull request or issue number.
Will be looked up automatically if you supply None and the current branch has an open pull request.
token: string [optional]
Token for accessing Github.
Will be defined automatically if you supply None and you
have previously stored a token using set_OMFIT_GitHub_token()
org: string [optional]
Organization on github, like ‘gafusion’.
Will be looked up automatically if you supply None and the current branch has an open pull request.
repository: string [optional]
Repository on github within the org
Will be looked up automatically if you supply None and the current branch has an open pull request.
Returns:
list
List of responses from github API requests, one entry per comment that the function attempts to delete.
If the test keyword is set, this will be converted into a list of dicts with information about the comments that
would be targeted.
Deploys a test script outside the framework. Imports will be different.
To include this in a test unit, do
>>> return_code, log_tail = run_test_outside_framework(__file__)
Parameters:
test_script – string
Path to the file you want to test.
If a test has a unit to test itself outside the framework, then this should be __file__.
Also make sure a test can’t run itself this way if it’s already outside the framework.
catch_exp_reports – bool
Try to grab the end of the log starting with exception reports.
Only works if test_script merges stdout and stderr; otherwise the exception reports will be somewhere else.
You can use with RedirectedStdStreams(stderr=sys.stdout): in your code to do the merging.
Returns:
(int, string)
Return code (0 is success)
End of output
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as is, this method detects whether .filename was changed and, if so, copies the file from the original .filename (saved in the .link attribute) to the new .filename
Perform the sum over ky spectrum
The inputs to this function should be already weighted by the intensity function
nk –> number of elements in ky spectrum
nm –> number of modes
ns –> number of species
nf –> number of fields (1: electrostatic, 2: electromagnetic parallel, 3:electromagnetic perpendicular)
Parameters:
sat_rule_in –
ky_spect – k_y spectrum [nk]
gp – growth rates [nk, nm]
ave_p0 – scalar average pressure
R_unit – scalar normalized major radius
kx0_e – spectral shift of the radial wavenumber due to VEXB_SHEAR [nk]
Helps with fetching data from the Thomson Scattering diagnostic.
It also has some other handy features to support analysis based on Thomson data:
Filter data by fractional error (e.g. throw out anything with > 10% uncertainty)
Filter data by reduced chi squared (e.g. throw out anything with redchisq > 8)
Filter data using ELM timing (e.g. only accept data if they are sampled between 50 and 99% of their local
inter-ELM period)
Parameters:
device – Device name, like ‘DIII-D’
shot – Shot number to analyze.
efitid – String describing the EFIT to use for mapping, such as ‘EFIT01’.
For DIII-D, ‘EFIT03’ and ‘EFIT04’ are recommended because they are calculated on the same time base as TS.
revision_num – A string specifying a revision like ‘REVISION00’ or just the number like 0.
-1 Selects the “blessed” or “best” revision automatically.
-2 selects the best revision the same as in -1, but also creates a folder with raw data from all revisions.
subsystems – List of Thomson systems to handle. For DIII-D, this can be any subset of
[‘core’, ‘divertor’, ‘tangential’].
For other devices, this setting does nothing and the systems list will be forced to [‘core’].
Set this to ‘auto_select’ to pick a setting that’s a good idea for your device.
(Currently, all non-DIII-D devices are forced to [‘core’].)
override_default_measurements – list of strings [optional]
Use this to do lightweight gathering of only a few quantities. More advanced uses, like filtering,
require all of the default quantities.
quality_filters –
Set to ‘default’ or a dictionary structure specifying settings for quality filters.
Missing settings will be set to default values (so an empty dictionary {} is a valid input here).
Top level settings in quality_filters:
remove_bad_slices: Any time-slice which is all bad measurements or any chord which is all bad will be
identified. These can be removed from the final dataset, which saves the user from carting around
bad data.
set_bad_points_to_zero: Multiply data by the okay flag, which will set all points marked as bad to 0+/-0
ip_thresh_frac_of_max: Set a threshold on Ip so that slices with low Ip (such as at the start of the
shot or during rampdown) will not pass the filter.
There are also filters specified on a subsystem-by-subsystem basis. In addition to the real
subsystems, there is a ‘global_override’ subsystem, which takes precedence if its settings aren’t None.
bad_chords: array or list of bad chord indices for this subsystem. Set to empty list if no bad channels.
(Real subsystems only, no global override)
redchisq_limit: A number specifying the maximum acceptable reduced chi squared value. This refers to the
fit to Thomson’s raw pulsed and DC data signals to determine Te and ne.
frac_temp_err_hot_max: Upper limit on acceptable fractional uncertainty in Te when Te is above the
hot_cold_boundary threshold.
frac_temp_err_cold_max: Upper limit on acceptable fractional uncertainty in Te when Te is below the
hot_cold_boundary threshold.
hot_cold_boundary: Te boundary between “hot” and “cold” temperatures, which have different fractional
uncertainty limits.
frac_dens_err_max: Maximum fractional uncertainty in ne measurements.
elm_filter – Provide an instance of an ELM filtering class like OMFITelm or set to None to have
OMFITthomson set this up automatically.
efit_data – This is usually None, which instructs OMFITthomson to gather its own EFIT data to use in
mapping. However, you can pass in a dictionary with contents matching the format returned by
self.gather_efit_data() and then self.gather_efit_data() will be skipped.
allow_backup_efitid – T/F: Allow self.gather_efit_data() to choose self.efitid if it fails to find data
for the requested EFIT.
debug – bool
Debug mode saves some intermediate results in a special dictionary.
verbose – bool
Always print debugging statements. May be useful when using this class outside the framework.
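A hedged construction sketch (keyword arguments are used because the positional order is not guaranteed here; the values are illustrative):
>> ts = OMFITthomson(device='DIII-D', shot=123456, efitid='EFIT03', subsystems='auto_select', quality_filters='default')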
Maps Thomson to the EFIT. Because most of the TS data are at the
same radial location, the interpolation of psi(r,z) onto R,Z for Thomson is
sped up by first interpolating to the R for most Thomson, then interpolating
along the resulting 1D psi(z). If there is more than one R value (such as if
tangential TS is included), the program loops over each unique R value. This
could be done with one 2D interpolation, but I think it would be slower.
Parameters:
note – Prints a note when starting mapping.
remove_efits_with_large_changes – Filter out EFIT time slices where the axis or boundary value of
un-normalized psi changes too fast. It’s supposed to trim questionable EFITs from the end, but it doesn’t
seem to keep them all out with reasonable thresholds. This feature was a nice idea and I think it’s
coded properly, but it isn’t performing at the expected level, so either leave it off or tune it up better.
ELM phase and timing data are used to select slices.
Individual data are flagged to indicate whether they passed the filter.
If any chords are completely bad, then they are just removed from the output.
Grabs Thomson data for the time window [t0-dt, t0+dt] for the sub-systems and parameters specified.
Parameters:
t0 – Center of the time window in ms
dt – Half-width of the time window in ms
systems – Thomson sub-systems to gather (like [‘core’, ‘divertor’, ‘tangential’]). If None: detect which
systems are available.
parameters – Parameters to gather (like [‘temp’, ‘density’, ‘press’])
psi_n_range – Range of psi_N values to accept
strict – Ignored (function accepts this keyword so it can be called generically with same keywords as its
counterpart in the quickCER module)
use_shifted_psi – T/F attempt to look up corrected psi_N (for profile alignment) in alt_x_path.
alt_x_path – An alternative path for gathering psi_N. This can be an OMFIT tree or a string which will
give an OMFITtree when operated on with eval(). Use this to provide corrected psi_N values after doing
profile alignment. Input should be an OMFIT tree containing trees for all the sub systems being considered
in this call to select_time_window (‘core’, ‘tangential’, etc.). Each subsystem tree should contain an array
of corrected psi_N values named ‘psin_corrected’.
comment – Optional: you can provide a string and your comment will be announced at the start of execution.
perturbation –
None or False for no perturbation or a dictionary with instructions for perturbing data for
doing uncertainty quantification studies such as Monte Carlo trials. The dictionary can have these keys:
random: T/F (default T): to scale perturbations by normally distributed random numbers
sigma: float (default 1): specifies scale of noise in standard deviations. Technically, I think you
could get away with passing in an array of the correct length instead of a scalar.
step_size: float: specifies absolute value of perturbation in data units (overrides sigma if present)
channel_mask: specifies which channels get noise (dictionary with a key for each parameter to mask
containing a list of channels or list of T/F matching len of channels_used for that parameter)
time_mask: specifies which time slices get noise (dictionary with a key for each parameter to mask
containing a list of times or list of T/F matching len of time_slices_used for that parameter)
data_mask: specifies which points get noise (overrides channel_mask and time_mask if present) (dictionary
with a key for each parameter to mask containing a list of T/F matching len of data for that parameter.
Note: this is harder to define than the channel and time lists.)
Shortcut: supply True instead of a dictionary to add 1 sigma random noise to all data.
realtime – T/F: Gather realtime data instead of post-shot analysis results.
Returns:
A dictionary containing all the parameters requested. Each parameter is given in a
dictionary containing x, y, e, and other information. x, y, and e are sorted by psi_N.
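For example, a hedged call using the parameters listed above (ts is assumed to be an OMFITthomson-like instance; the returned layout follows the description above):
>> window = ts.select_time_window(t0=3000.0, dt=25.0, systems=['core'], parameters=['temp', 'density'])
>> te = window['temp']  # dictionary with 'x', 'y', 'e' (sorted by psi_N) and other information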
This is limited to quantities which are insensitive to ion measurements and can be reasonably estimated
from electron data only with limited assumptions about ions.
Assumptions:
Parameters:
zeff – float
Assumed effective charge state of ions in the plasma:
Z_eff=(n_i * Z_i^2 + n_b * Z_b^2) / (n_i * Z_i + n_b * Z_b)
ti_over_te – float or numeric array matching shape of te
Assumed ratio of ion to electron temperature
zi – float or numeric array matching shape of ne
Charge state of main ions. Hydrogen/deuterium ions = 1.0
mi – float or numeric array matching shape of te
Mass of main ions in amu. Deuterium = 2.0
zb – float or numeric array matching shape of ne
Charge state of dominant impurity. Fully stripped carbon = 6.0
mb – float or numeric array matching shape of ne
Mass of dominant impurity ions in amu. Carbon = 12.0
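Combining the quasineutrality condition n_e = n_i*Z_i + n_b*Z_b with the Z_eff definition above allows the ion densities to be derived from electron data alone. A minimal, hedged sketch of that algebra (variable names and values are illustrative, not the module's actual code):
>> import numpy as np
>> ne = np.array([3e19, 4e19, 5e19])   # electron density [m^-3]
>> te = np.array([2.0, 1.5, 1.0])      # electron temperature [keV]
>> zeff, zi, zb, ti_over_te = 1.8, 1.0, 6.0, 1.0
>> ni = ne * (zb - zeff) / (zi * (zb - zi))   # main-ion density from quasineutrality + Z_eff
>> nb = ne * (zeff - zi) / (zb * (zb - zi))   # impurity density
>> ti = ti_over_te * te                       # assumed ion temperature [keV]
>> p_total = 1.602e-16 * (ne * te + (ni + nb) * ti)  # total pressure [Pa] (T in keV)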
Removes bad values from arrays to avoid math errors, for use when calculating Thomson derived quantities
Parameters:
args – list of items to sanitize
kwargs – Keywords
- okay: array matching dimensions of items in args: Flag indicating whether each element in args is okay
- bad_fill_value: float: Value to use to replace bad elements
Returns:
list of sanitized items from args, followed by bad
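A minimal sketch of the sanitize pattern described above, written with plain numpy for illustration (the actual helper follows the parameter list; names here are hypothetical):
>> import numpy as np
>> te = np.array([1.5, -0.1, 2.0, np.nan])       # items to sanitize
>> ne = np.array([3e19, 4e19, -1.0, 5e19])
>> okay = np.isfinite(te) & (te > 0) & (ne > 0)  # flag indicating which elements are okay
>> te_s, ne_s = [np.where(okay, a, 1e-10) for a in (te, ne)]  # bad_fill_value = 1e-10
>> bad = ~okay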
Plots profiles of physics quantities vs. spatial position for a selected time window
Parameters:
t – float
Center of time window in ms
dt – float
Half-width of time window in ms. All data between t-dt and t+dt will be plotted.
position_type – string
Name of X coordinate. Valid options: ‘R’, ‘Z’, ‘PSI’
params – list of strings
List physics quantities to plot. Valid options are temp, density, and press
systems – list of strings or ‘all’
List subsystems to include in the plot. Choose ‘all’ to use self.subsystems.
unit_convert – bool
Convert units from eV to keV, etc. so most quantities will be closer to order 1 in the core.
fig – Figure instance
Plot will be drawn using existing figure instance if one is provided. Otherwise, a new figure will be made.
axs – 1D array of Axes instances matching length of params.
Plots will be drawn using existing Axes instances if they are provided. Otherwise, new axes will be added to
fig.
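A hedged usage sketch (the object ‘ts’ and the method name ‘profile_plot’ are assumptions; the parameters follow the list above):
>> from matplotlib import pyplot
>> fig, axs = pyplot.subplots(3, 1, sharex=True)
>> ts.profile_plot(t=3000, dt=25, position_type='PSI',
>>                 params=['temp', 'density', 'press'],
>>                 systems='all', unit_convert=True, fig=fig, axs=axs)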
Plots contours of a physics quantity vs. time and space
Parameters:
position_type – string
Select position from ‘R’, ‘Z’, or ‘PSI’
params – list of strings
Select parameters from ‘temp’, ‘density’, ‘press’, or ‘redchisq’
unit_convert – bool
Convert units from e.g. eV to keV to try to make most quantities closer to order 1
combine_data_before_contouring – bool
Combine data into a single array before calling tricontourf. This may look smoother, but it can hide the way
arrays from different subsystems are stitched together
num_color_levels – int
Number of contour levels
fig – Figure instance
Provide a Matplotlib Figure instance and an appropriately dimensioned array of Axes instances to overlay
axs – array of Axes instances
Provide a Matplotlib Figure instance and an appropriately dimensioned array of Axes instances to overlay
Returns:
Figure instance, array of Axes instances
Returns references to figure and axes used in plot
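A hedged sketch (the object ‘ts’ and the method name ‘contour_plot’ are assumptions; the figure and axes are returned as described above):
>> fig, axs = ts.contour_plot(position_type='PSI', params=['temp', 'density'],
>>                            unit_convert=True, num_color_levels=41)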
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as-is, this method detects whether .filename was changed and, if so, makes a copy from the original .filename (saved in the .link attribute) to the new .filename
This class is used to query from database through tokesearch API
Parameters:
serverPicker – (string) A string designating the server on which to create the toksearch query.
shots – A list of shot numbers (ints) to be fetched
signals – A dict where each key corresponds to the signal name returned by toksearch, and each entry is a list
which describes a signal object to be fetched by toksearch. The first element of the list is a string
giving the toksearch signal class, e.g. ‘PtDataSignal’; the 2nd and 3rd entries are the args (list) and keyword
args (dict), respectively. Ex) [‘PtDataSignal’,[‘ip’],{}] corresponds to a fetch for
the PtData ‘ip’ signal.
datasets – A dict representing xarray datasets to be created from fetched signals.
aligners – A dict where the keys are name of the dataset to align and the entries are a corresponding list of Aligners
functions – A list of functions or executable strings to be executed in the toksearch mapping stage
where – (string) An evaluatable string (executed in the namespace of each record) which should return a boolean
indicating whether the record should be returned by the toksearch query. Use this to filter shots by certain
criteria: a record is kept when the string evaluates to True and discarded when it evaluates to False.
keep – A list of strings naming which attributes (signal, dataset, etc.) of each record are to be
returned by the toksearch query. Default: return all attributes in the record namespace.
compute_type – (string) Type of method to be used to run the pipeline. Options: ‘serial’,’spark’,’ray’
compute_type=’ray’ gives better memory usage and parallelization
return_data – (string) A string specifying how the data fetched by toksearch should be structured.
Options: ‘by_shot’, ‘by_signal’. ‘by_shot’ returns a dictionary keyed by shot number, with each record namespace
stored under its entry. ‘by_signal’ returns a dictionary keyed by the union of all record attributes,
with a dictionary organized by shot number under each key. default: ‘by_shot’
NOTE: When fetching ‘by_signal’, Datasets will be concatenated together over all valid shots.
warn – (bool) If flag is true, the user will be warned if they are about to pull back more than 50% of their available memory and can respond accordingly. This is a safety precaution when pulling back large datasets that may cause you to run out of memory. (default: True).
use_dask – (bool) If this flag is True, created datasets will be loaded using dask. Loading with dask reduces the amount of
RAM used by saving the data to disk and only loading it into memory in chunks. (default: False)
load_data – (bool) If this flag is False, then data will be transferred to disk under OMFIT current working directory, but the data will not be loaded into memory (RAM) and thus the OMFITtree will not be updated. This is to be used when fetching data too large to fit into memory. (default True).
**compute_kwargs –
keyword arguments to be passed into the toksearch compute functions
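A hedged sketch of assembling the arguments described above (the class name OMFITtoksearch, the serverPicker value, and the point names are assumptions that may differ in your installation):
>> signals = {
>>     'ip': ['PtDataSignal', ['ip'], {}],   # fetch the PtData 'ip' signal
>>     'bt': ['PtDataSignal', ['bt'], {}],   # hypothetical second point name
>> }
>> query = OMFITtoksearch(serverPicker='toksearch_server',   # assumed server entry
>>                        shots=[180000, 180001],
>>                        signals=signals,
>>                        where="record['ip'] is not None",  # keep records where 'ip' was fetched
>>                        keep=['ip', 'bt'],
>>                        compute_type='ray',
>>                        return_data='by_shot')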
Function that takes a signal name and keyword arguments and puts them in the format that the toksearch query method expects.
The signal specified is the one that the dataset is intended to be aligned with.
Parameters:
align_with – A string representing the name of the signal in ‘signals’ with respect to which the dataset is to be aligned.
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as-is, this method detects whether .filename was changed and, if so, makes a copy from the original .filename (saved in the .link attribute) to the new .filename
Class used to interface with TRANSP input “namelist”
This is necessary because the TRANSP input namelist is not a default format
(e.g. with the update_time functionality)
Parameters:
filename – filename passed to OMFITobject class
**kw – keyword dictionary passed to OMFITobject class
If Data is one dimensional, it is plotted using the matplotlib plot function.
If 2D, the default is to show the data using View2d. If a slice_axis is defined, the slices
are shown as line plots.
Extra key words are passed to the plot or View2d function used.
Parameters:
axes (Axes) – Axes in which to make the plots.
label (str) – Labels the data in the plot. If ‘LABEL’ or ‘RPLABEL’ these values are taken from the Data.
slice_axis (int) – Slices 2D data along the radial (0) or time (1) axis.
slice_at (np.ndarray) – Slices made in slice_axis. An empty list plots all available slices.
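A hedged usage sketch, following the TRANSP conventions of the examples below (the variable name ‘TE’ and the slice times are illustrative):
>> te = OMFITtranspData(root['OUTPUTS']['TRANSP_OUTPUT'], 'TE')
>> te.plot()                                   # 2D data shown with View2d by default
>> te.plot(slice_axis=1, slice_at=[2.0, 3.0])  # line plots sliced along the time axis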
For 1D input, returns an uncertainties uarray of the data’s time average and standard deviation.
For 2D input, returns an uncertainties uarray of the profile’s time average and standard deviation.
Example:
Assuming data in root[‘OUTPUTS’][‘TRANSP_OUTPUT’]
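The example above is truncated in this extraction; a minimal, hedged sketch of what the returned uncertainties uarray represents (the ‘TE’ variable and the (time, x) data layout are assumptions consistent with the examples below):
>> from uncertainties import unumpy
>> te = OMFITtranspData(root['OUTPUTS']['TRANSP_OUTPUT'], 'TE')              # 'TE' is illustrative
>> profile = unumpy.uarray(te['DATA'].mean(axis=0), te['DATA'].std(axis=0))  # 2D input: time average ± std at each x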
Derivative with respect to volume for TRANSP variables consistent with
TRANSP finite differencing methods.
See Solomon’s unvolint.pro
Parameters:
d (OMFITtranspData) – OMFITtranspData object from the MDSplus TRANSP tree.
dvol (OMFITtranspData or None) – OMFITtranspData object ‘dvol’ from the mds TRANSP tree.
If None, will be taken from the Data’s MDStree.
Returns:
dy/dV OMFITtranspData object on zone-centered grid.
Example:
Assuming the root is an OMFIT TRANSP module with a loaded run.
>> mvisc = OMFITtranspData(root[‘OUTPUTS’][‘TRANSP_OUTPUT’],’MVISC’)
>> tvisc = vint(mvisc)
>> tvisc[‘DATA’][0,-1] # total viscous torque at first time step in Nm
2.0986965
>> mvisc2 = vder(tvisc)
>> np.all(np.isclose(mvisc2[‘DATA’][0,:],mvisc[‘DATA’][0,:]))
True
Volume integrate a TRANSP OMFITmdsValue object.
Currently only available for objects from the TRANSP tree.
Parameters:
d (OMFITtranspData) – OMFITtranspData object from the MDSplus TRANSP OUTPUTS.TWO_D tree.
dvol (OMFITtranspData or None) – OMFITtranspData object ‘dvol’ from the mds TRANSP tree.
If None, will be taken from Data’s MDStree.
Example:
Assuming the root is an OMFIT TRANSP module with a loaded run.
>> mvisc = OMFITtranspData(root[‘OUTPUTS’][‘TRANSP_OUTPUT’],’MVISC’)
>> tvisc = vint(mvisc)
>> tvisc[‘DATA’][0,-1] # total viscous torque at first time step in Nm
2.0986965
>> mvisc2 = vder(tvisc)
>> np.all(np.isclose(mvisc2[‘DATA’][0,:],mvisc[‘DATA’][0,:]))
True
label – String labeling the data. ‘LABEL’ or ‘RPLABEL’ are taken from TRANSP metadata.
squeeze – Bool forcing all plots to be made on a single figure. Default is True for 1D data.
All other keyword arguments are passed to the individual OMFITtranspData plot functions.
Plot TRANSP data, using default metadata.
If Data is one dimensional, it is plotted using the matplotlib plot function.
If 2D, the default is to show the data using View2d. If a slice_axis is defined, the slices
are shown as line plots.
Extra key words are passed to the plot or View2d function used.
Parameters:
axes (Axes) – Axes in which to make the plots.
label (str) – Labels the data in the plot. If ‘LABEL’ or ‘RPLABEL’ these values are taken from the Data.
slice_axis (int) – Slices 2D data along the radial (0) or time (1) axis.
slice_at (np.ndarray) – Slices made in slice_axis. An empty list plots all available slices.
Derivative with respect to volume for TRANSP variables consistent with
TRANSP finite differencing methods.
See Solomon’s unvolint.pro
Parameters:
d (OMFITtranspData) – OMFITtranspData object from the MDSplus TRANSP tree.
dvol (OMFITtranspData or None) – OMFITtranspData object ‘dvol’ from the mds TRANSP tree.
If None, will be taken from the Data’s MDStree.
Returns:
dy/dV OMFITtranspData object on zone-centered grid.
Example:
Assuming the root is an OMFIT TRANSP module with a loaded run.
>> mvisc = OMFITtranspData(root[‘OUTPUTS’][‘TRANSP_OUTPUT’],’MVISC’)
>> tvisc = vint(mvisc)
>> tvisc[‘DATA’][0,-1] # total viscous torque at first time step in Nm
2.0986965
>> mvisc2 = vder(tvisc)
>> np.all(np.isclose(mvisc2[‘DATA’][0,:],mvisc[‘DATA’][0,:]))
True
Volume integrate a TRANSP OMFITmdsValue object.
Currently only available for objects from the TRANSP tree.
Parameters:
d (OMFITtranspData) – OMFITtranspData object from the MDSplus TRANSP OUTPUTS.TWO_D tree.
dvol (OMFITtranspData or None) – OMFITtranspData object ‘dvol’ from the mds TRANSP tree.
If None, will be taken from Data’s MDStree.
Example:
Assuming the root is an OMFIT TRANSP module with a loaded run.
>> mvisc = OMFITtranspData(root[‘OUTPUTS’][‘TRANSP_OUTPUT’],’MVISC’)
>> tvisc = vint(mvisc)
>> tvisc[‘DATA’][0,-1] # total viscous torque at first time step in Nm
2.0986965
>> mvisc2 = vder(tvisc)
>> np.all(np.isclose(mvisc2[‘DATA’][0,:],mvisc[‘DATA’][0,:]))
True
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as-is, this method detects whether .filename was changed and, if so, makes a copy from the original .filename (saved in the .link attribute) to the new .filename
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as-is, this method detects whether .filename was changed and, if so, makes a copy from the original .filename (saved in the .link attribute) to the new .filename
Parse the bbb.v, com.v, flx.v, and grd.v files and generate a mapper for the common blocks.
This is useful because the BASIS version of UEDGE did not require
specifying the common blocks for variables.
The transition to PyUEDGE requires adding the common block
information to the old input files.
To translate old UEDGE files to the new PyUEDGE format use:
>> OMFITuedgeBasisInput(‘old_uedge_input.txt’).convert()
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as-is, this method detects whether .filename was changed and, if so, makes a copy from the original .filename (saved in the .link attribute) to the new .filename
This built-in function makes use of the OMFIT utils.smooth function
to smooth over a single dimension of the data.
If the axis in question is irregular, the data is first linearly interpolated onto
a regular grid with spacing equal to the minimum step size of the irregular grid.
Parameters:
window_x – Smoothing window size in axis coordinate units.
window_len – Smoothing window size in index units. Ignored if window_x present. Enforced odd integer.
window – the type of window from ‘flat’, ‘hanning’, ‘hamming’, ‘bartlett’, ‘blackman’
flat window will produce a moving average smoothing.
axis – Dimension over which to smooth. Accepts integer (0), key (‘X0’), or name (‘TIME’).
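A hedged usage sketch (‘data_obj’ stands for whichever data object exposes this smoothing method; the window size assumes the axis coordinate is in seconds):
>> smoothed = data_obj.smooth(window_x=0.05, window='hanning', axis='TIME')  # ~50 ms hanning window
>> smoothed_idx = data_obj.smooth(window_len=11, window='flat', axis=0)      # 11-point moving average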
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as-is, this method detects whether .filename was changed and, if so, makes a copy from the original .filename (saved in the .link attribute) to the new .filename
The save method is supposed to be overridden by classes which use OMFITobject as a superclass.
If left as-is, this method detects whether .filename was changed and, if so, makes a copy from the original .filename (saved in the .link attribute) to the new .filename
NOTE: for strings that are arrays of elements one may use the following notation:
>> line = '1 1.2 2.3 4.5 8*5.6'
>> values = []
>> for item in re.split('[ | ]+', line.strip()):
>>     values.extend(tolist(namelist.interpreter(item)))
Parameters:
orig – string value element in a fortran namelist format
escaped_strings – whether strings follow proper escaping
This function collects the multiple namelist arrays into a single one:
collectArrays(**{
    '__default__': 0,         # default value for the whole namelist (used when no default is found)
    'BCMOM': {                # options for a specific entry in the namelist
        'default': 3,         # a default value must be defined to perform math ops (automatically set by a=...)
        'shape': (30, 30),    # overrides automatic shape detection (automatically set by a(30,30)=...)
        'offset': (-10, -10), # overrides automatic offset detection (automatically set to the minimum of the offsets of the entries in all dimensions a(-10,-10)=...)
        'dtype': 0,           # overrides automatic type detection (automatically set to float if at least one float is found)
    },
})
FORTRAN namelist file object, which can contain multiple namelists blocks
Parameters:
filename – filename to be parsed
input_string – input string to be parsed (takes precedence over filename)
nospaceIsComment – whether a line which starts without a space should be retained as a comment. If None, a “smart” guess is attempted
outsideOfNamelistIsComment – whether the content outside of the namelist blocks should be retained as comments. If None, a “smart” guess is attempted
retain_comments – whether comments should be retained or discarded
skip_to_symbol – string to jump to for the parsing. Content before this string is ignored
collect_arrays – whether arrays defined throughout the namelist should be collected into single entries
(e.g. a=5,a(1,4)=0)
multiDepth – whether nested namelists are allowed
bang_comment_symbol – string containing the characters that should be interpreted as comment delimiters.
equals – how the equal sign should be written when saving the namelist
compress_arrays – compress repeated elements in an array by using v*n namelist syntax
max_array_chars – wrap long array lines
explicit_arrays – (True,False,1) whether to place name(1) in front of arrays.
If 1 then (1) is only placed in front of arrays that have only one value.
separator_arrays – characters to use between array elements
split_arrays – write each array element explicitly on a separate line.
Specifically, this functionality was introduced to split TRANSP arrays
idlInput – whether to interpret the namelist as IDL code
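A hedged sketch of typical usage, assuming the class described here is exposed as OMFITnamelist (the file name, block name, and variable are hypothetical):
>> nml = OMFITnamelist('input.dat',
>>                     retain_comments=True,
>>                     collect_arrays=True,
>>                     compress_arrays=True)
>> nml['INDATA']['NSHOT'] = 123456   # edit an entry inside a namelist block
>> nml.save()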
Trace flux surfaces and calculate flux-surface averaged and geometric quantities
Inputs can be tables of PSI and Bt or an OMFITgeqdsk file
Parameters:
Rin – (ignored if gEQDSK!=None) array of the R grid mesh
Zin – (ignored if gEQDSK!=None) array of the Z grid mesh
PSIin – (ignored if gEQDSK!=None) PSI defined on the R/Z grid
Btin – (ignored if gEQDSK!=None) Bt defined on the R/Z grid
Rcenter – (ignored if gEQDSK!=None) Radial location where the vacuum field is defined ( B0 = F[-1] / Rcenter)
F – (ignored if gEQDSK!=None) F-poloidal
P – (ignored if gEQDSK!=None) pressure
rlim – (ignored if gEQDSK!=None) array of limiter r points (used for SOL)
zlim – (ignored if gEQDSK!=None) array of limiter z points (used for SOL)
gEQDSK – OMFITgeqdsk file or ODS
resolution – if int the original equilibrium grid will be multiplied by (resolution+1), if float the original equilibrium grid is interpolated to that resolution (in meters)
forceFindSeparatrix – force finding of separatrix even though this may be already available in the gEQDSK file
levels – levels in normalized psi. Can be an array ranging from 0 to 1, or the number of flux surfaces
map – array ranging from 0 to 1 which will be used to set the levels, or ‘rho’ if flux surfaces are generated based on gEQDSK
maxPSI – (default 0.9999)
calculateAvgGeo – Boolean which sets whether flux-surface averaged and geometric quantities are automatically calculated
quiet – Verbosity level
**kw – overwrite key entries
>> OMFIT[‘test’]=OMFITgeqdsk(OMFITsrc+’/../samples/g133221.01000’)
>> # copy the original flux surfaces
>> flx=copy.deepcopy(OMFIT[‘test’][‘fluxSurfaces’])
>> # to use PSI
>> mapping=None
>> # to use RHO instead of PSI
>> mapping=OMFIT[‘test’][‘RHOVN’]
>> # trace flux surfaces
>> flx.findSurfaces(np.linspace(0,1,100),mapping=mapping)
>> # to increase the accuracy of the flux surface tracing (higher numbers -> smoother surfaces, more time, more memory)
>> flx.changeResolution(2)
>> # plot
>> flx.plot()
packing – if levels is integer, packing of flux surfaces close to the separatrix
resolution – accuracy of the flux surface tracing
rlim – list of R coordinates points where flux surfaces intersect limiter
zlim – list of Z coordinates points where flux surfaces intersect limiter
open_flx – dictionary with flux surface rhon value as keys of where to calculate SOL (passing this will not set the sol entry in the flux-surfaces class)
Function used to generate boundary shapes based on T. C. Luce, PPCF, 55 9 (2013)
Direct Python translation of the IDL program /u/luce/idl/shapemaker3.pro
Parameters:
a – minor radius
eps – aspect ratio
kapu – upper elongation
kapl – lower elongation
delu – upper triangularity
dell – lower triangularity
zetaou – upper outer squareness
zetaiu – upper inner squareness
zetail – lower inner squareness
zetaol – lower outer squareness
zoffset – z-offset
upnull – toggle upper x-point
lonull – toggle lower x-point
npts – int
number of points (per quadrant)
doPlot – plot boundary shape construction
newsq – A 4 element array, into which the new squareness values are stored
gEQDSK – input gEQDSK to match (wins over rbbbs and zbbbs)
verbose – print debug statements
npts – int
Number of points
Returns:
dictionary with parameters to feed to the boundaryShape function
[a, eps, kapu, kapl, delu, dell, zetaou, zetaiu, zetail, zetaol, zoffset, upnull, lonull]
gEQDSK – input gEQDSK to match (wins over rbbbs and zbbbs)
verbose – print debug statements
doPlot – visualize match
precision – optimization tolerance
npts – int
Number of points
Returns:
dictionary with parameters to feed to the boundaryShape function
[a, eps, kapu, kapl, delu, dell, zetaou, zetaiu, zetail, zetaol, zoffset, upnull, lonull]
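A hedged sketch calling boundaryShape with the parameters listed above (the values are illustrative, and the return of R and Z boundary arrays is an assumption):
>> r, z = boundaryShape(a=0.6, eps=0.33, kapu=1.8, kapl=1.8,
>>                      delu=0.4, dell=0.4,
>>                      zetaou=-0.05, zetaiu=-0.05, zetail=-0.05, zetaol=-0.05,
>>                      zoffset=0.0, upnull=False, lonull=True,
>>                      npts=90, doPlot=True)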