How to Split Word Documents Programmatically with Java and Spire.Doc

Published: December 11, 2025, 10:38 AM (GMT+9)

5 min read

Source: Dev.to

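Introduction to Spire.Doc for Java and Installation

Spire.Doc for Java is a professional Java library for creating, writing, editing, converting, and printing Word documents without Microsoft Office installed. It supports the DOC, DOCX, RTF, and XML formats. Its API lets developers perform complex document operations such as splitting, merging, and content extraction.

To add Spire.Doc for Java to a Maven project, declare the e-iceblue repository and the dependency in pom.xml: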

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.doc</artifactId>
        <version>13.11.2</version>
    </dependency>
</dependencies>

After adding the dependency, sync (reload) the project so that the build tool downloads the required libraries.
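One quick way to confirm that the dependency actually resolved is to ask the class loader for a class from the library. The sketch below is only an illustration: the class name `SpireClasspathCheck` and the helper `isOnClasspath` are hypothetical, and the only thing taken from the article is the package of `com.spire.doc.Document`, which the examples import.

```java
public class SpireClasspathCheck {
    // Hypothetical helper: returns true if the named class can be located
    // on the classpath, without running its static initializers.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, SpireClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // com.spire.doc.Document is the class the article's examples rely on
        System.out.println("Spire.Doc available: " + isOnClasspath("com.spire.doc.Document"));
    }
}
```

If this prints `false` after a sync, the repository entry or the dependency coordinates in pom.xml are the first things to re-check.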

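Splitting a Word Document by Page Breaks

Splitting by page breaks is ideal when each logical unit (for example, a chapter or a report section) starts on a new page. The approach works well when visual page boundaries coincide with content boundaries.

Process for splitting by page breaks

Create a Document instance and load the source file with Document.loadFromFile().
Create a new Word document and add a section to it.
Iterate over all body child objects in each section of the source document, identifying paragraphs and tables.
If the object is a table, add it directly to the section of the new document.
If the object is a paragraph, add it to the new section, then check the paragraph's child objects for a page break.
When a page break is found, remove it from the paragraph, save the current new document, and start a new document for the next block of content.
Repeat until all content has been processed.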
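The control flow of these steps can be sketched without the library by modelling paragraphs as plain strings. This is a simplified illustration, not Spire.Doc code: `PageBreakSplitModel`, `split`, and the use of the form-feed character as a stand-in for a page break are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class PageBreakSplitModel {
    // Hypothetical stand-in for a page break inside a paragraph's content
    static final char PAGE_BREAK = '\f';

    // Splits a list of paragraphs into parts, cutting at every page break.
    static List<List<String>> split(List<String> paragraphs) {
        List<List<String>> parts = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String para : paragraphs) {
            int breakIdx = para.indexOf(PAGE_BREAK);
            while (breakIdx >= 0) {
                // Content before the break belongs to the current part
                String before = para.substring(0, breakIdx);
                if (!before.isEmpty()) {
                    current.add(before);
                }
                // "Save" the current part and start a new one
                parts.add(current);
                current = new ArrayList<>();
                // Content after the break is carried into the next part
                para = para.substring(breakIdx + 1);
                breakIdx = para.indexOf(PAGE_BREAK);
            }
            if (!para.isEmpty()) {
                current.add(para);
            }
        }
        // The final part is stored even if it is empty, mirroring the last save
        parts.add(current);
        return parts;
    }

    public static void main(String[] args) {
        List<String> doc = List.of("Chapter 1", "intro text\f", "Chapter 2", "more text");
        System.out.println(split(doc)); // [[Chapter 1, intro text], [Chapter 2, more text]]
    }
}
```

The same shape applies with real documents: accumulate content into the current output, flush it when a break is seen, and flush once more at the end.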

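Java example

import com.spire.doc.*;
import com.spire.doc.documents.*;

public class SplitDocByPageBreak {
    public static void main(String[] args) throws Exception {
        // Load the original document
        Document original = new Document();
        original.loadFromFile("E:\\Files\\SplitByPageBreak.docx");

        // Prepare the first output document
        Document newWord = new Document();
        Section section = newWord.addSection();
        int index = 0;

        // Traverse all sections of the original document
        for (int s = 0; s < original.getSections().getCount(); s++) {
            Section sec = original.getSections().get(s);
            // Traverse all body child objects of the section
            for (int c = 0; c < sec.getBody().getChildObjects().getCount(); c++) {
                DocumentObject obj = sec.getBody().getChildObjects().get(c);
                if (obj instanceof Paragraph) {
                    Paragraph para = (Paragraph) obj;
                    // Copy the paragraph into the current output document
                    Paragraph copy = (Paragraph) para.deepClone();
                    section.getBody().getChildObjects().add(copy);
                    // Look for a page break among the paragraph's child objects
                    // (assumes at most one page break per paragraph)
                    for (int i = 0; i < para.getChildObjects().getCount(); i++) {
                        DocumentObject child = para.getChildObjects().get(i);
                        if (child instanceof Break
                                && ((Break) child).getBreakType().equals(BreakType.Page_Break)) {
                            int breakIdx = i;
                            // Trim the break and anything after it from the copy
                            while (copy.getChildObjects().getCount() > breakIdx) {
                                copy.getChildObjects().removeAt(breakIdx);
                            }
                            // Save the current part and start the next one
                            newWord.saveToFile("output/result" + index + ".docx", FileFormat.Docx);
                            index++;
                            newWord = new Document();
                            section = newWord.addSection();
                            section.getBody().getChildObjects().add(para.deepClone());
                            // Remove the break and everything before it
                            while (breakIdx >= 0) {
                                section.getParagraphs().get(0).getChildObjects().removeAt(breakIdx);
                                breakIdx--;
                            }
                            break; // stop scanning this paragraph
                        }
                    }
                } else if (obj instanceof Table) {
                    // Add tables directly to the new document
                    section.getBody().getChildObjects().add(obj.deepClone());
                }
            }
        }

        // Save the final part
        newWord.saveToFile("output/result" + index + ".docx", FileFormat.Docx);
    }
}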
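The example writes its parts to a relative output/ folder. The article does not say whether saveToFile() creates a missing folder, so a defensive option is to create it up front with the standard library. The helper below is a hypothetical sketch (`OutputPaths` and `partPath` are names invented here); it only builds the target path and makes sure the folder exists.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class OutputPaths {
    // Hypothetical helper: ensures the folder exists and returns the path
    // for the part with the given index, e.g. <dir>/result0.docx.
    static Path partPath(String dir, String baseName, int index) {
        Path folder = Paths.get(dir);
        try {
            // No-op if the folder already exists
            Files.createDirectories(folder);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return folder.resolve(baseName + index + ".docx");
    }

    public static void main(String[] args) {
        // Same folder and base name as the example above
        Path target = partPath("output", "result", 0);
        System.out.println(target);
    }
}
```

The resulting path can then be passed to the save call as `partPath("output", "result", index).toString()`.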

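Splitting a Word Document by Section Breaks

Splitting by section breaks gives finer control, particularly for documents whose parts differ in headers and footers, page orientation, or other layout settings. A section break marks a logical division that can carry its own formatting properties.

Process for splitting by section breaks

Create a Document instance and load the source file.
Iterate over every section of the source document.
For each section, create a new Word document.
Clone the section with Section.deepClone().
Add the cloned section to the new document with Document.getSections().add().
Save each resulting document with Document.saveToFile().

Java example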

import com.spire.doc.*;

public class SplitDocBySectionBreak {
    public static void main(String[] args) throws Exception {
        // Load the original document
        Document original = new Document();
        original.loadFromFile("E:\\Files\\SplitBySectionBreak.docx");

        // Write each section to its own document
        for (int i = 0; i < original.getSections().getCount(); i++) {
            // Start a fresh output document for this section
            Document newDoc = new Document();

            // Clone the section so the original document is left untouched
            Section srcSection = original.getSections().get(i);
            Section clonedSection = (Section) srcSection.deepClone();
            newDoc.getSections().add(clonedSection);

            // Save this part under an indexed file name
            newDoc.saveToFile("output/sectionSplit" + i + ".docx", FileFormat.Docx);
        }
    }
}
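When a document has ten or more sections, file names with a plain numeric suffix no longer sort in document order in most file browsers. A small, library-independent sketch of zero-padded naming follows; `PartNames` and `partName` are hypothetical names, and the 1-based numbering is a choice made for this example rather than something the article prescribes.

```java
public class PartNames {
    // Hypothetical helper: builds a zero-padded, 1-based file name so that
    // lexicographic order matches section order.
    static String partName(String prefix, int index, int total) {
        // Pad to as many digits as the total count needs
        int width = String.valueOf(total).length();
        return String.format("%s-%0" + width + "d.docx", prefix, index + 1);
    }

    public static void main(String[] args) {
        int total = 12;
        System.out.println(partName("section", 0, total));  // section-01.docx
        System.out.println(partName("section", 11, total)); // section-12.docx
    }
}
```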
