Celeb_a wrong checksum


Short description

NonMatchingChecksumError is raised when downloading the celeb_a dataset.

Environment information

  • Operating System: Arch Linux
  • Python version: 3.7
  • tensorflow-datasets version: 1.2.0
  • tensorflow version: 2.0.0rc0

Reproduction instructions

import tensorflow_datasets as tfds
tfds.load('celeb_a')

Link to logs

https://gist.github.com/EmanueleGhelfi/ea7b5684036262ff402c85e11db7dd3b
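A NonMatchingChecksumError means the downloaded file did not match the size and SHA-256 hash that TFDS records for that URL. One way to investigate is to hash the local file yourself and compare. The sketch below is only an illustration: the helper names `sha256_of_file` and `describe_file` are hypothetical, and the expected values have to be looked up separately in the TFDS checksum records for celeb_a. A file that is far smaller than expected may be an error page saved in place of the archive.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        # Read fixed-size chunks so a multi-gigabyte archive is never held in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe_file(path):
    """Return (size_in_bytes, sha256_hex) to compare against the expected values."""
    p = Path(path)
    return p.stat().st_size, sha256_of_file(p)
```

For example, `describe_file("img_align_celeba.zip")` returns the pair to check against the recorded size and checksum.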

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 8 (2 by maintainers)

Top GitHub Comments

1 reaction
jzazo commented, Jan 28, 2020

I downloaded img_align_celeba.zip directly from Google Drive and tried to load it with tfds.load(name="celeb_a", data_dir=my_unzipped_folder), but I get:

ConnectionError: HTTPConnectionPool(host='storage.googleapis.com', port=80): Max retries exceeded with url: /tfds-data/dataset_info/celeb_a/2.0.0/dataset_info.json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff7b7ec9190>: Failed to establish a new connection: [Errno 24] Too many open files'))

If I use tfds.builder("celeb_a") I get ResourceExhaustedError: /home/javier/tensorflow_datasets/celeb_a; Too many open files.

Could you please tell me how to load and process the dataset as easily as possible after downloading it manually? Thank you!
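Both errors above end in "Too many open files" (Errno 24), which is the operating system reporting that the process has reached its limit on open file descriptors. As a diagnostic sketch only, assuming a Unix-like system such as the Arch Linux setup in the report, Python's standard `resource` module can show the limit and raise the soft limit. The helper names are hypothetical, and raising the limit may only hide the underlying cause rather than fix it.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(target):
    """Raise the soft limit toward `target`, capped at the hard limit.

    Returns the resulting soft limit and never lowers the current one.
    """
    soft, hard = open_file_limits()
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft == resource.RLIM_INFINITY or target <= soft:
        return soft
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target
```

Calling `open_file_limits()` before `tfds.load` shows how much headroom the process has. The shell equivalent for inspecting the soft limit is `ulimit -n`.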

0 reactions
Conchylicultor commented, Nov 16, 2020

For info, tensorflow-datasets 1.2.0 is a very old version. I would strongly encourage you to use a more recent version.

With recent TFDS, you can manually download the data yourself. See: https://www.tensorflow.org/datasets/overview#manual_download_if_download_fails
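The manual-download route can be sketched roughly as follows. This is a sketch under assumptions, not a confirmed recipe for celeb_a: it assumes a recent TFDS release, that the manual directory is `<data_dir>/downloads/manual/`, and that the files the builder expects have already been placed there under the names it looks for (check the linked documentation for both). `default_manual_dir` and `prepare_celeb_a` are hypothetical helper names.

```python
from pathlib import Path


def default_manual_dir(data_dir="~/tensorflow_datasets"):
    """Assumed location where TFDS looks for manually downloaded files."""
    return Path(data_dir).expanduser() / "downloads" / "manual"


def prepare_celeb_a(data_dir="~/tensorflow_datasets"):
    """Build celeb_a using files already placed in the manual directory."""
    # Imported lazily so the path helper above works without TFDS installed.
    import tensorflow_datasets as tfds

    data_dir = Path(data_dir).expanduser()
    config = tfds.download.DownloadConfig(
        manual_dir=str(default_manual_dir(data_dir))
    )
    builder = tfds.builder("celeb_a", data_dir=str(data_dir))
    builder.download_and_prepare(download_config=config)
    return builder.as_dataset(split="train")
```

Note that `data_dir` here is the TFDS root directory, not the folder containing the unzipped images.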

