WorkflowPageImport: Difference between revisions
From Mark Twain in the German Language Press
Revision as of 15:50, 26 September 2025
Importing Images
with Extension:Simple Batch Uploader
Preparing image files
- file name corresponds to item ID (e.g. “IA-001”)
- file extension is “.png”
- only one file per item
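The three naming rules above can be checked before uploading. The following is a minimal sketch, not part of the wiki setup: the function name `check_image_names` and the idea of passing in a set of known item IDs are assumptions made for illustration.

```python
from collections import Counter
from pathlib import PurePath


def check_image_names(file_names, known_ids):
    """Return a list of problems found in a batch of image file names.

    file_names: iterable of file names such as "IA-001.png"
    known_ids:  set of item IDs that should each have exactly one image
    """
    problems = []
    files_per_item = Counter()
    for name in file_names:
        path = PurePath(name)
        # Rule: file extension is ".png"
        if path.suffix != ".png":
            problems.append(f"{name}: extension is not .png")
        # Rule: file name corresponds to an item ID
        if path.stem not in known_ids:
            problems.append(f"{name}: no matching item ID")
        files_per_item[path.stem] += 1
    # Rule: only one file per item
    for stem, count in files_per_item.items():
        if count > 1:
            problems.append(f"{stem}: {count} files for one item")
    return problems
```

An empty list means the batch follows all three rules; otherwise each entry names the file or item ID that needs attention.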
Uploading image files
- Special Page “BatchUpload” (https://pressger.twainframe.org/Special:BatchUpload)
- description field: <noinclude>[[Category:Article Image]]</noinclude>
- select files with explorer by clicking the button OR drag and drop files to select
- Important:
- description needs to be filled out before selecting files
- upload will start immediately after selection and description field will be cleared upon each upload
Checking for correct upload
- newly uploaded files should be contained in the list “Category:Article Images”
- File Page names should correspond to the pattern “File:item-ID.png”
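For larger batches, the comparison against the category list can be done with a short script. This is only a sketch under assumptions: the function name `missing_file_pages` is hypothetical, and the list of titles is assumed to have been copied by hand from the category page.

```python
def missing_file_pages(item_ids, category_titles):
    """Return the expected File Page names that are absent from the category.

    item_ids:        item IDs of the uploaded images, e.g. "IA-001"
    category_titles: page titles as listed on the category page
    """
    # Expected pattern from the checklist above: "File:item-ID.png"
    expected = {f"File:{item_id}.png" for item_id in item_ids}
    return sorted(expected - set(category_titles))
```

Any name returned here points to an upload that failed or to a file that was named differently from its item ID.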
Creating Article Pages
Extension:Data Transfer
Preparing page data as CSV file:
- standard Article Page has 15 properties
- first row corresponds to property titles:
- Title: item ID (will be the page title)
- ArticleInfo[date-iso]: date (YYYY-MM-DD)
- ArticleInfo[date]: date (DD mmm YYYY)
- ArticleInfo[title]: article title OR [summary]
- ArticleInfo[title-en]: translation of title OR empty
- ArticleInfo[paper]: newspaper name
- ArticleInfo[city]: city
- ArticleInfo[state]: state
- ArticleInfo[type]: article type
- ArticleInfo[words]: word count
- ArticleInfo[ref]: works referenced in article
- ArticleInfo[source]: source repository name
- ArticleInfo[source-link]: link to article in repository
- ArticleImage[file]: item ID OR empty (for non-transcribed articles)*
- Free Text: page body content (can be left out and changed later for each page individually)
- for non-transcribed articles: {{ArticleNotIncluded}}
- for transcribed articles: see Design Principles for info on formatting, what to include, and how to structure Article Pages (the section with “article notes” at the top of some pages might be easier to add individually after importing)
- Important:
- page creation is usually fast, but for CSV files with more than 300 entries it might be easier to split the file and upload the parts separately (refer to the extension manual for information on upload speed)
- field delimiter needs to be a comma (,) and string delimiter a quotation mark (")
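The CSV layout described above can be generated with Python's standard `csv` module. This is a minimal sketch, assuming the page data is already available as a list of dictionaries keyed by the 15 property titles; the helper names `rows_to_csv` and `split_rows` are made up for this example, and the default limit of 300 rows follows the note above.

```python
import csv
import io

# The 15 property titles of a standard Article Page, in first-row order.
COLUMNS = [
    "Title",
    "ArticleInfo[date-iso]",
    "ArticleInfo[date]",
    "ArticleInfo[title]",
    "ArticleInfo[title-en]",
    "ArticleInfo[paper]",
    "ArticleInfo[city]",
    "ArticleInfo[state]",
    "ArticleInfo[type]",
    "ArticleInfo[words]",
    "ArticleInfo[ref]",
    "ArticleInfo[source]",
    "ArticleInfo[source-link]",
    "ArticleImage[file]",
    "Free Text",
]


def rows_to_csv(rows):
    """Serialize rows (dicts keyed by COLUMNS) to CSV text.

    Uses a comma as field delimiter and a quotation mark as string
    delimiter. Missing properties are written as empty fields; an
    unknown property name raises ValueError.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=COLUMNS,
        delimiter=",",
        quotechar='"',
        quoting=csv.QUOTE_MINIMAL,
        restval="",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def split_rows(rows, max_rows=300):
    """Split rows into consecutive chunks of at most max_rows entries."""
    return [rows[i:i + max_rows] for i in range(0, len(rows), max_rows)]
```

Each chunk returned by `split_rows` can be passed to `rows_to_csv` and saved as its own file; the encoding used when saving is the one to specify later on the import page.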
Importing data from CSV file
- Special Page “Import CSV” (https://pressger.twainframe.org/Special:ImportCSV)
- select CSV file by clicking the button
- specify encoding that was used when creating CSV file
- specify how to treat already existing pages
- overwrite existing content (everything will be overwritten; use with caution)
- overwrite only fields contained in the file (useful to make adjustments to pages or correct mistakes in previous imports)
- skip (safest option to avoid unwanted changes)
- append to existing content (useful to make changes to the “Free Text” area; add categories, etc.)
- summary field: optional information for MediaWiki activity log; specify the type of data or scope of what was imported
Checking for correct upload
- newly created pages should be contained in the list “Category:Article Pages” and in the overview table on the page https://pressger.twainframe.org/ArticleOverview
- Article Page names should correspond to the item ID
- the header section should contain the standard information
- page content:
- pages with transcription: corresponding screenshot should be visible on the left below the header section; missing files will be indicated by a red link
- pages without transcription (*): if the property “ArticleImage[file]” was included in the import but left empty, the template call is still added to the page (without an image file name); this line has to be removed from each page individually. An alternative is to make a separate CSV for non-transcribed items and leave out the “ArticleImage[file]” property entirely, so that the template call is not added to these pages in the first place.
- Important:
- depending on the volume of pages created, it can be useful to check each page individually OR to only check a sample of pages and leave the rest to be checked when annotations, mark-up, or contextual information are added
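The alternative mentioned above for pages without transcription (a separate CSV that leaves out “ArticleImage[file]”) can be prepared by splitting the page data before export. This is a sketch under assumptions: rows are dictionaries keyed by property title with string values, and the function name `split_by_transcription` is hypothetical.

```python
def split_by_transcription(rows, image_column="ArticleImage[file]"):
    """Split rows into transcribed and non-transcribed items.

    Rows with a non-empty image property count as transcribed. For all
    other rows the image property is removed entirely, so that an import
    of these rows does not contain the property at all.
    """
    transcribed = []
    non_transcribed = []
    for row in rows:
        if (row.get(image_column) or "").strip():
            transcribed.append(dict(row))
        else:
            without_image = {
                key: value for key, value in row.items() if key != image_column
            }
            non_transcribed.append(without_image)
    return transcribed, non_transcribed
```

The two lists are then written to two CSV files and imported separately; the file for non-transcribed items has one column fewer in its first row.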