Changelog
Intake-esm v2021.8.17
Enhancements made
Add pydantic models to facilitate data validation #347 (@andersy005)
Maintenance and upkeep improvements
[pre-commit.ci] pre-commit autoupdate #355 (@pre-commit-ci)
skip cmip6_preprocessing tests for the time being #354 (@andersy005)
Bump styfle/cancel-workflow-action from 0.9.0 to 0.9.1 #348 (@dependabot)
Update pre-commit hooks #346 (@andersy005)
Bump codecov/codecov-action from 1 to 2.0.2 #345 (@dependabot)
Disable workflows on Forks #342 (@andersy005)
Add missing test dependency #340 (@andersy005)
Code refactoring #338 (@andersy005)
Bump pre-commit/action from v2.0.2 to v2.0.3 #337 (@dependabot)
Bump styfle/cancel-workflow-action from 0.8.0 to 0.9.0 #334 (@dependabot)
Bump pre-commit/action from v2.0.0 to v2.0.2 #333 (@dependabot)
Bump styfle/cancel-workflow-action from 0.7.0 to 0.8.0 #322 (@dependabot)
Fix CI #321 (@andersy005)
Fix Tests: Use a publicly available s3 object #318 (@andersy005)
Bump styfle/cancel-workflow-action from 0.6.0 to 0.7.0 #316 (@dependabot)
Documentation improvements
Docs: Execute all notebooks #341 (@andersy005)
Enable comments in docs via sphinx-comments #326 (@andersy005)
Contributors to this release
Intake-esm v2021.1.15
Bug Fixes
Fix memory error when computing unique values #313 (@andersy005)
Breaking Changes
Drop support for Python 3.6 #311 (@andersy005)
Internal Changes
Upgrade dependencies & pin versions in CI environment #314 (@andersy005)
Fix failing upstream-dev CI #310 (@andersy005)
Documentation
Update MPI catalogs for MISTRAL #308 (@aaronspring)
Contributors to this release
Intake-esm v2020.12.18
Bug Fixes
Disable _requested_variables for single variable assets #306 (@andersy005)
Internal Changes
Update changelog in preparation for new release #307 (@andersy005)
Use github-activity to update list of contributors #302 (@andersy005)
Add nbqa & Update prettier commit hooks #300 (@andersy005)
Update pre-commit and GH actions #299 (@andersy005)
Contributors to this release
(GitHub contributors page for this release)
@andersy005 | @dcherian | @jbusecke | @naomi-henderson | @Recalculate
Intake-esm v2020.11.4
Features
Support multiple variable assets/files. (GH#287) @andersy005
Add utility function for printing version information. (GH#284) @andersy005
Breaking Changes
Remove unnecessary logging bits. (GH#297) @andersy005
Bug Fixes
Fix test failures. (GH#280) @andersy005
Fix TypeError bug in .search() method when using wildcard and regular expressions. (GH#285) @andersy005
Use file-like objects when dealing with netCDF in the cloud. (GH#292) @andersy005
Documentation
Fix ReadtheDocs documentation builds. (GH#286) @andersy005
Migrate docs from reStructuredText to Markdown via myst-parsers. (GH#296) @andersy005
Refactor documentation contents & add new notebooks. (GH#298) @andersy005
Internal Changes
Fix import errors due to intake/intake#526. (GH#282) @andersy005
Migrate CI from CircleCI to GitHub Actions. (GH#283) @andersy005
Use mamba to speed up CI testing. (GH#293) @andersy005
Enable dependabot updates. (GH#294) @andersy005
Test against Python 3.9. (GH#295) @andersy005
Contributors to this release
(GitHub contributors page for this release)
@andersy005 | @dcherian | @jbusecke | @jukent | @sherimickelson
Intake-esm v2020.8.15
Features
Support regular expression objects in search() (GH#236) @andersy005
Support wildcard expressions in search() (GH#259) @andersy005
Expose attributes used when aggregating/combining datasets (GH#268) @andersy005
Support turning aggregations off (GH#269) @andersy005
Improve error messages (GH#270) @andersy005
Expose aggregations options passed to xarray during datasets aggregation (GH#272) @andersy005
Reset _entries dict after updating aggregations (GH#274) @andersy005
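The regular expression and wildcard support in search() listed above can be pictured with a small standalone sketch. This is not the intake-esm implementation: the search() and matches() helpers, the row dictionaries, and the column values below are all hypothetical, and the matching uses only the standard library (fnmatch for wildcards, re for compiled patterns).

```python
import fnmatch
import re


def matches(value, pattern):
    """Return True if value matches a compiled regex or a wildcard string."""
    if isinstance(pattern, re.Pattern):
        # Compiled regular expression objects are searched as-is.
        return pattern.search(value) is not None
    # Plain strings are treated as shell-style wildcards (case-sensitive).
    return fnmatch.fnmatchcase(value, pattern)


def search(rows, **query):
    """Keep rows where every queried column matches at least one pattern."""
    result = []
    for row in rows:
        keep = True
        for column, patterns in query.items():
            if isinstance(patterns, (str, re.Pattern)):
                patterns = [patterns]
            if not any(matches(row[column], p) for p in patterns):
                keep = False
                break
        if keep:
            result.append(row)
    return result


# Hypothetical catalog rows, for illustration only.
rows = [
    {"source_id": "CESM2", "variable_id": "tas"},
    {"source_id": "CESM2-WACCM", "variable_id": "pr"},
    {"source_id": "MIROC6", "variable_id": "tasmax"},
]

wildcard_hits = search(rows, source_id="CESM*")
regex_hits = search(rows, variable_id=re.compile(r"^tas"))
```

Here wildcard_hits holds the two CESM rows, and regex_hits holds the rows whose variable name starts with "tas".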
Documentation
Update to_dataset_dict() docstring to explain how the cdf_kwargs argument is used with regard to chunking (GH#278) @bonnland
Internal Changes
Update pre-commit hooks & GitHub actions (GH#260) @andersy005
Update badges (GH#258) @andersy005
Update upstream environment (GH#263) @andersy005
Refactor search functionality into a standalone module (GH#267) @andersy005
Fix dask/concurrent.futures parallelism (GH#271) @andersy005
Increase test coverage to ~100% (GH#273) @andersy005
Bump minimum required versions (GH#275) @andersy005
Contributors to this release
(GitHub contributors page for this release)
@andersy005 | @bonnland | @dcherian | @jeffdlb | @jukent | @kmpaul | @markusritschel | @martindurant | @matt-long
Intake-esm v2020.6.11
Features
Add df property setter (GH#247) @andersy005
Documentation
Use Pandas sphinx theme (GH#244) @andersy005
Update documentation tutorial (GH#252) @andersy005 & @charlesbluca
Internal Changes
Fix anti-patterns and other bug risks (GH#251) @andersy005
Sync with intake's Entry unification (GH#249) @andersy005
Contributors to this release
Intake-esm v2020.5.21
Features
Provide informative message/warnings from empty queries. (GH#235) @andersy005
Replace tqdm progressbar with fastprogress. (GH#238) @andersy005
Add catalog_file attribute to esm_datastore class. (GH#240) @andersy005
Contributors to this release
(GitHub contributors page for this release)
@andersy005 | @bonnland | @dcherian | @jbusecke | @jeffdlb | @kmpaul | @markusritschel
Intake-esm v2020.5.01
Features
Add html representation for the catalog object. (GH#229) @andersy005
Move logic for assets aggregation into ESMGroupDataSource() and add a few basic dict-like methods (keys(), len(), getitem(), contains()) to the catalog object. (GH#194) @andersy005 & @jhamman & @kmpaul
Support columns with iterables in unique() and nunique(). (GH#223) @andersy005
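As a rough picture of what "dict-like methods" on a catalog object means, the sketch below wraps a plain mapping and exposes keys(), len(), item access, and membership tests. The CatalogSketch class and the group keys are hypothetical and stand in for the real catalog object, whose internals are not shown in this changelog.

```python
class CatalogSketch:
    """Hypothetical stand-in for a catalog that behaves like a read-only dict."""

    def __init__(self, groups):
        # Map from a group key to whatever describes that group of assets.
        self._groups = dict(groups)

    def keys(self):
        """Return the group keys in insertion order."""
        return list(self._groups)

    def __len__(self):
        return len(self._groups)

    def __getitem__(self, key):
        return self._groups[key]

    def __contains__(self, key):
        return key in self._groups


# Illustrative keys and values only.
catalog = CatalogSketch(
    {
        "modelA.historical": ["file1.nc", "file2.nc"],
        "modelB.historical": ["file3.nc"],
    }
)

number_of_groups = len(catalog)
has_model_a = "modelA.historical" in catalog
model_b_assets = catalog["modelB.historical"]
```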
Bug Fixes
Internal Changes
Increase test coverage. (GH#222) @andersy005
Contributors to this release
(GitHub contributors page for this release)
@andersy005 | @bonnland | @dcherian | @jbusecke | @jhamman | @kmpaul | @sherimickelson
Intake-esm v2020.3.16
Features
Add progressbar argument to to_dataset_dict(). This allows the user to override the default progressbar value used during the class instantiation. (GH#204) @andersy005
Enhanced search: enforce query criteria via the require_all_on argument of the search() method. (GH#202) & (GH#207) & (GH#209) @andersy005 & @jbusecke
Support relative paths for catalog files. (GH#208) @andersy005
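One way to read "enforce query criteria via require_all_on" is: after the usual filtering, keep only those groups (for example, models) that contain every requested value. The sketch below illustrates that reading in plain Python. It is an assumption about the semantics, not the library's code, and the search_require_all() helper and the sample rows are hypothetical.

```python
from collections import defaultdict


def search_require_all(rows, require_all_on, **query):
    """Filter rows, then keep only groups that satisfy every queried value."""
    # Ordinary search: a row matches if each queried column holds any requested value.
    hits = [
        row
        for row in rows
        if all(row[column] in values for column, values in query.items())
    ]

    # Group the matching rows by the require_all_on column.
    groups = defaultdict(list)
    for row in hits:
        groups[row[require_all_on]].append(row)

    # Keep a group only if, for each queried column, all requested values appear in it.
    kept = []
    for group_rows in groups.values():
        complete = all(
            set(values) <= {row[column] for row in group_rows}
            for column, values in query.items()
        )
        if complete:
            kept.extend(group_rows)
    return kept


# Hypothetical rows: model "A" has both experiments, model "B" only one.
rows = [
    {"source_id": "A", "experiment_id": "historical"},
    {"source_id": "A", "experiment_id": "piControl"},
    {"source_id": "B", "experiment_id": "historical"},
]

result = search_require_all(
    rows, "source_id", experiment_id=["historical", "piControl"]
)
```

With these rows, only the two entries for model "A" survive, because model "B" lacks the piControl experiment.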
Bug Fixes
Use raw path if protocol is None. (GH#210) @andersy005
Internal Changes
GitHub Action to publish package to PyPI on release. (GH#190) @andersy005
Remove unnecessary inheritance. (GH#193) @andersy005
Update linting GitHub action to run on all pull requests. (GH#196) @andersy005
Contributors to this release
(GitHub contributors page for this release)
@andersy005 | @bonnland | @dcherian | @jbusecke | @jhamman | @kmpaul
Intake-esm v2019.12.13
Features
Add optional preprocess argument to to_dataset_dict() (GH#155) @matt-long
Allow users to disable dataset aggregations by passing aggregate=False to to_dataset_dict() (GH#164) @matt-long
Avoid manipulating dataset coordinates by using data_vars=varname when concatenating datasets via xarray.concat() (GH#174) @andersy005
Support loading netCDF assets from OPeNDAP endpoints (GH#176) @andersy005
Add serialize() method to serialize collection/catalog (GH#179) @andersy005
Allow passing extra storage options to the backend file system via to_dataset_dict() (GH#180) @bonnland
Provide informational messages to the user via the logging module (GH#186) @andersy005
Bug Fixes
Remove the caching option (GH#158) @matt-long
Preserve encoding when aggregating datasets (GH#161) @matt-long
Sort aggregations to make sure intake_esm.merge_util.join_existing() is always done before intake_esm.merge_util.join_new() (GH#171) @andersy005
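The ordering fix above (join_existing before join_new) amounts to a stable sort on the aggregation type. A minimal sketch, assuming aggregations are dictionaries with "type" and "attribute_name" keys; the sort_aggregations() helper, the priority table, and the placement of any other type after the two named ones are assumptions made for illustration.

```python
# Lower number means the aggregation is applied earlier.
AGGREGATION_PRIORITY = {"join_existing": 0, "join_new": 1}


def sort_aggregations(aggregations):
    """Return aggregations ordered so join_existing precedes join_new.

    Types not listed in AGGREGATION_PRIORITY are placed last (an assumption).
    sorted() is stable, so entries of the same type keep their relative order.
    """
    fallback = len(AGGREGATION_PRIORITY)
    return sorted(
        aggregations,
        key=lambda agg: AGGREGATION_PRIORITY.get(agg["type"], fallback),
    )


# Hypothetical aggregation settings.
aggregations = [
    {"type": "join_new", "attribute_name": "member_id"},
    {"type": "union", "attribute_name": "variable_id"},
    {"type": "join_existing", "attribute_name": "time_range"},
]

ordered_types = [agg["type"] for agg in sort_aggregations(aggregations)]
```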
Documentation
Internal Changes
Simplify group loading by using concurrent.futures (GH#185) @andersy005
Contributors to this release
(GitHub contributors page for this release)
@andersy005 | @bonnland | @dcherian | @jbusecke | @jhamman | @matt-long | @naomi-henderson | @Recalculate | @sebasblancogonz
Intake-esm v2019.10.15
Features
Rewrite intake-esm's core based on the Earth System Model Collection specification (esm-collection-spec) (GH#135) @andersy005, @matt-long, @rabernat
Breaking changes
Replaced intake_esm.core.esm_metadatastore with intake_esm.core.esm_datastore; see the API reference for more details.
intake-esm won't build collection catalogs anymore. intake-esm now expects an ESM collection JSON file as input. This JSON should conform to the Earth System Model Collection specification.
Contributors to this release
(GitHub contributors page for this release)
@aaronspring | @andersy005 | @bonnland | @dcherian | @n-henderson | @naomi-henderson | @rabernat
Intake-esm v2019.8.23
Features
Add mistral data holdings to intake-esm-datastore (GH#133) @aaronspring
Replace .csv with netCDF as serialization format when saving the built collection to disk. With netCDF, we can record very useful information into the global attributes of the netCDF dataset. (GH#119) @andersy005
Add string representation of ESMMetadataStoreCatalog object (GH#122) @andersy005
Automatically build missing collections by calling esm_metadatastore(collection_name="GLADE-CMIP5") when the specified collection is part of the curated collections in intake-esm-datastore. (GH#124) @andersy005
    In [1]: import intake
    In [2]: col = intake.open_esm_metadatastore(collection_name="GLADE-CMIP5")
    In [3]: # if "GLADE-CMIP5" collection isn't built already, the above is equivalent to:
    In [4]: col = intake.open_esm_metadatastore(collection_input_definition="GLADE-CMIP5")
Revert back to using official DRS attributes when building CMIP5 and CMIP6 collections. (GH#126) @andersy005
Add .df property for interfacing with the built collection via dataframe, to maintain backwards compatibility. (GH#127) @andersy005
Add unique() and nunique() methods for summarizing count and unique values in a collection. (GH#128) @andersy005
    In [1]: import intake
    In [2]: col = intake.open_esm_metadatastore(collection_name="GLADE-CMIP5")
    In [3]: col
    Out[3]: GLADE-CMIP5 collection catalogue with 615853 entries:
        > 3 resource(s)
        > 1 resource_type(s)
        > 1 direct_access(s)
        > 1 activity(s)
        > 218 ensemble_member(s)
        > 51 experiment(s)
        > 312093 file_basename(s)
        > 615853 file_fullpath(s)
        > 6 frequency(s)
        > 25 institute(s)
        > 15 mip_table(s)
        > 53 model(s)
        > 7 modeling_realm(s)
        > 3 product(s)
        > 9121 temporal_subset(s)
        > 454 variable(s)
        > 489 version(s)
    In [4]: col.nunique()
    resource                3
    resource_type           1
    direct_access           1
    activity                1
    ensemble_member       218
    experiment             51
    file_basename      312093
    file_fullpath      615853
    frequency               6
    institute              25
    mip_table              15
    model                  53
    modeling_realm          7
    product                 3
    temporal_subset      9121
    variable              454
    version               489
    dtype: int64
    In [5]: col.unique(columns=['frequency', 'modeling_realm'])
    {'frequency': {'count': 6, 'values': ['mon', 'day', '6hr', 'yr', '3hr', 'fx']},
     'modeling_realm': {'count': 7, 'values': ['atmos', 'land', 'ocean', 'seaIce', 'ocnBgchem', 'landIce', 'aerosol']}}
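The shape of the unique() and nunique() summaries shown above can be reproduced on a tiny in-memory table. The sketch below is not the library's code: the nunique() and unique() functions here are hypothetical standalone helpers, and the rows are made-up sample data. It only mirrors the output layout from the session above (a count per column, and a count plus list of values per column).

```python
def nunique(rows, columns):
    """Count the distinct values found in each requested column."""
    return {column: len({row[column] for row in rows}) for column in columns}


def unique(rows, columns):
    """List distinct values per column, in first-seen order, with their count."""
    summary = {}
    for column in columns:
        values = []
        for row in rows:
            if row[column] not in values:
                values.append(row[column])
        summary[column] = {"count": len(values), "values": values}
    return summary


# Made-up sample rows, for illustration only.
rows = [
    {"frequency": "mon", "modeling_realm": "atmos"},
    {"frequency": "day", "modeling_realm": "atmos"},
    {"frequency": "mon", "modeling_realm": "ocean"},
]

counts = nunique(rows, ["frequency", "modeling_realm"])
details = unique(rows, ["frequency", "modeling_realm"])
```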
Bug Fixes
For CMIP6, extract grid_label from directory path instead of file name. (GH#127) @andersy005
Contributors to this release
Intake-esm v2019.8.5
Features
Support building collections using inputs from intake-esm-datastore repository. (GH#79) @andersy005
Ensure that requested files are available locally before loading data into xarray datasets. (GH#82) @andersy005 and @matt-long
Split collection definitions out of config. (GH#83) @matt-long
Add intake-esm-builder, a CLI tool for building collections from the command line. (GH#89) @andersy005
Add support for CESM-LENS data holdings residing in AWS S3. (GH#98) @andersy005
Sort collection upon creation according to order-by-columns, pass urlpath through stack for use in parsing collection filenames (GH#100) @pbranson
Bug Fixes
Fix bug in _list_files_hsi() to return list instead of filter object. (GH#81) @matt-long and @andersy005
cesm._get_file_attrs fixed to break loop when longest stream is matched. (GH#80) @matt-long
Restore non_dim_coords to data variables all the time. (GH#90) @andersy005
Fix bug in intake_esm/cesm.py that caused intake-esm to exclude hourly (1hr, 6hr, etc.) CESM-LE data. (GH#110) @andersy005
Fix bugs in intake_esm/cmip.py that caused improper regular expression matching for table_id and grid_label. (GH#113) & (GH#111) @naomi-henderson and @andersy005
Internal Changes
Refactor existing functionality to make intake-esm robust and extensible. (GH#77) @andersy005
Add aggregate._override_coords function to override dim coordinates except time in case there's a floating point precision difference. (GH#108) @andersy005
Fix CESM-LE ice component peculiarities that caused intake-esm to load data improperly. The fix separates variables for the ice component into two separate components:
ice_sh: for southern hemisphere
ice_nh: for northern hemisphere
Contributors to this release
Intake-esm v2019.5.11
Features
Add implementation for The Gridded Meteorological Ensemble Tool (GMET) data holdings (GH#61) @andersy005
Allow users to specify exclude_dirs for CMIP collections (GH#63) & (GH#62) @andersy005
Keep CMIP6 tracking_id in merge_keys (GH#67) @andersy005
Add implementation for ERA5 datasets (GH#68) @andersy005
Contributors to this release
Intake-esm v2019.4.26
Features
Add implementations for CMIPCollection and CMIPSource (GH#38) @andersy005
Add support for CMIP6 data (GH#46) @andersy005
Add implementation for The Max Planck Institute Grand Ensemble (MPI-GE) data holdings (GH#52) & (GH#51) @aaronspring and @andersy005
Return dictionary of datasets all the time for consistency (GH#56) @andersy005
Bug Fixes
Include multiple netcdf files in same subdirectory (GH#55) & (GH#54) @naomi-henderson and @andersy005
Contributors to this release
Intake-esm v2019.2.28
Features
Allow CMIP integration (GH#35) @andersy005
Bug Fixes
Fix bug on build catalog and move exclude_dirs to locations (GH#33) @matt-long
Internal Changes
Change Logger, update dev-environment dependencies, and formatting fix in input.yml (GH#31) @matt-long
Update CircleCI workflow (GH#32) @andersy005
Rename package from intake-cesm to intake-esm (GH#34) @andersy005