Required field 'uncompressed_page_size' was not found in serialized data! Struct:
Hi, please see the error below, which occurs when reading a Parquet file; kindly help. I have enclosed a sample file downloaded from Kaggle, and the same error appears when I test it in the Presto CLI.
I am using Presto version 326 …
io.prestosql.spi.PrestoException: can not read class org.apache.parquet.format.PageHeader: Required field 'uncompressed_page_size' was not found in serialized data! Struct: org.apache.parquet.format.PageHeader$PageHeaderStandardScheme@33fc99f6
at io.prestosql.plugin.hive.parquet.ParquetPageSource$ParquetBlockLoader.load(ParquetPageSource.java:167)
at io.prestosql.spi.block.LazyBlock$LazyData.load(LazyBlock.java:378)
at io.prestosql.spi.block.LazyBlock$LazyData.getFullyLoadedBlock(LazyBlock.java:357)
at io.prestosql.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:275)
at io.prestosql.spi.Page.getLoadedPage(Page.java:261)
at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:290)
at io.prestosql.operator.Driver.processInternal(Driver.java:379)
at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
at io.prestosql.operator.Driver.processFor(Driver.java:276)
at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
at io.prestosql.$gen.Presto_326____20191205_193016_2.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: can not read class org.apache.parquet.format.PageHeader: Required field 'uncompressed_page_size' was not found in serialized data! Struct: org.apache.parquet.format.PageHeader$PageHeaderStandardScheme@33fc99f6
at org.apache.parquet.format.Util.read(Util.java:216)
at org.apache.parquet.format.Util.readPageHeader(Util.java:65)
at io.prestosql.parquet.reader.ParquetColumnChunk.readPageHeader(ParquetColumnChunk.java:57)
at io.prestosql.parquet.reader.ParquetColumnChunk.readAllPages(ParquetColumnChunk.java:67)
at io.prestosql.parquet.reader.ParquetReader.readPrimitive(ParquetReader.java:256)
at io.prestosql.parquet.reader.ParquetReader.readColumnChunk(ParquetReader.java:310)
at io.prestosql.parquet.reader.ParquetReader.readBlock(ParquetReader.java:293)
at io.prestosql.plugin.hive.parquet.ParquetPageSource$ParquetBlockLoader.load(ParquetPageSource.java:161)
... 16 more
Caused by: io.prestosql.hive.$internal.parquet.org.apache.thrift.protocol.TProtocolException: Required field 'uncompressed_page_size' was not found in serialized data! Struct: org.apache.parquet.format.PageHeader$PageHeaderStandardScheme@33fc99f6
at org.apache.parquet.format.PageHeader$PageHeaderStandardScheme.read(PageHeader.java:1055)
at org.apache.parquet.format.PageHeader$PageHeaderStandardScheme.read(PageHeader.java:966)
at org.apache.parquet.format.PageHeader.read(PageHeader.java:843)
at org.apache.parquet.format.Util.read(Util.java:213)
... 23 more
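Since the trace shows the reader failing while decoding a page header, one cheap first check is whether the file itself is intact. The sketch below is not part of Presto or the original report: it only assumes the file has been copied to a local path, and `has_parquet_magic` is a hypothetical helper name. It verifies that the file starts and ends with the 4-byte `PAR1` magic that every unencrypted Parquet file carries. Passing this check rules out simple truncation only; it does not validate individual page headers.

```python
import os

# Every unencrypted Parquet file begins and ends with these 4 bytes.
PARQUET_MAGIC = b"PAR1"


def has_parquet_magic(path):
    """Return True if the file at `path` starts and ends with the Parquet magic.

    Hypothetical helper for illustration; it inspects only the first and
    last 4 bytes and does not parse the footer or any page header.
    """
    size = os.path.getsize(path)
    # Smallest plausible file: leading magic + 4-byte footer length + trailing magic.
    if size < 12:
        return False
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

If this returns False for the sample file, the file was probably truncated or is not Parquet at all. If it returns True, the problem more likely lies in how the reader locates or decodes the pages.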
Issue Analytics
- Created 4 years ago
- Comments:13 (7 by maintainers)
Hi hashhar, thanks for the information. Yes, I experienced the same kind of problem… The underlying data platform JARs did not support reading huge tables. The vendor has fixed the issue; no changes were needed on the Presto side…
I’m experiencing the same problem on Trino 381 (on Kubernetes, using the community container image). Everything works fine until I enable caching for the Hive connector. Then, after a few successful queries, they start failing with:
Here is my hive catalog config WITHOUT caching enabled (queries work just fine):
And WITH caching enabled (causes the exception):