Abstract: [Purpose] The Chinese cultural heritage corpus is an important component of cultural large language models. It is of great value for implementing the national cultural digitalization strategy, consolidating the new infrastructure for cultural large language models, and promoting the marketization of cultural data elements. [Method] Using literature research and inductive analysis, this paper examines the definition and categories of the Chinese cultural heritage corpus at the theoretical level, identifies the challenges facing corpus construction, and proposes construction strategies. [Result] The field currently faces several challenges: an insufficient stock of high-quality corpora, uneven corpus quality, inconsistent annotation standards, and unclear ownership of corpus data. [Conclusion] The paper recommends incorporating corpus construction into the national cultural heritage governance system, building a dedicated national cultural heritage corpus, establishing a multi-dimensional, high-precision corpus quality assessment system, developing a semantically driven, co-evolutionary corpus annotation mechanism, and formulating mechanisms for corpus data sharing and copyright management, so as to strengthen the support that the Chinese cultural heritage corpus provides to cultural large language models.