Add scene auto-tagging from filename #204

Merged
merged 7 commits into from
Dec 1, 2019

Conversation

WithoutPants
Collaborator

Adds an "Auto tag" section to the tasks settings page. Here you can tick whether to include performers, studios and tags. Clicking the button starts the auto-tagging job.

The auto-tagger loops through all of the performers/studios/tags in the database and searches for scenes whose filename matches the name, using the following regex to handle word separators: (?: |\.|-|_)?. For each scene it finds, it adds the performer/studio/tag to the scene.
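
As an illustration of the matching described above, here is a minimal sketch in Go. The helper name `nameToPattern` and the case-insensitive flag are assumptions made for the example, not taken from the PR; only the separator pattern comes from the description.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// separator is the optional word-separator pattern quoted in the description.
const separator = `(?: |\.|-|_)?`

// nameToPattern is a hypothetical helper: it escapes each word of the name
// and joins the words with the optional separator pattern.
func nameToPattern(name string) *regexp.Regexp {
	words := strings.Fields(name)
	for i, w := range words {
		words[i] = regexp.QuoteMeta(w)
	}
	// (?i) makes the match case-insensitive (an assumption for this sketch).
	return regexp.MustCompile(`(?i)` + strings.Join(words, separator))
}

func main() {
	re := nameToPattern("Sydney Cole")
	files := []string{
		"Sydney.Cole.720p.mp4",
		"sydney_cole.mp4",
		"SydneyCole.mp4",
		"Sydney.X.Cole.mp4",
	}
	for _, f := range files {
		fmt.Println(f, re.MatchString(f))
	}
}
```

Because the separator is optional and nothing anchors the name on either side, a pattern like this can also match inside longer words, which is the false-positive concern raised later in the conversation.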

Also adds buttons to the performer, studio and tag pages which run the auto-tag process for a specific performer/studio/tag.

Should resolve #6 and #34

@Ch00nassid

Ch00nassid commented Nov 15, 2019

Omg I love this feature!!!! Does it spit out what it tags into some sort of log?

@Leopere
Collaborator

Leopere commented Nov 15, 2019

@WithoutPants the TravisCI build failed due to TravisCI being Travis again.

@Leopere
Collaborator

Leopere commented Nov 15, 2019

Yep nevermind this time it wasn't.

@Leopere Leopere added the feature and investigate labels Nov 15, 2019
@WithoutPants
Collaborator Author

Could you add a separator before and after the name? Otherwise there will be false positives.
Also, rather use this as the separator: sep = "[. _-]+"

regex = '(?:^|' + sep + ')' + name.replace(' ', sep) + '(?:' + sep + '|$)'

I definitely see the benefit of checking before and after the name, but don't you think the separator should be optional within the name? E.g. xxx.FirstNameLastName.mp4
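
To make the two positions concrete, here is a rough Go sketch that requires a separator (or the start/end of the string) around the name while keeping the separator optional between the words of the name. The helper `boundedPattern` and the case-insensitive flag are assumptions for illustration; this is not necessarily the pattern the PR ended up using.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// sep is the separator character class suggested in the review comment.
const sep = `[. _-]`

// boundedPattern is a hypothetical helper. It demands a separator or the
// start/end of the string around the name, but allows zero or more
// separators between the words of the name.
func boundedPattern(name string) *regexp.Regexp {
	words := strings.Fields(name)
	for i, w := range words {
		words[i] = regexp.QuoteMeta(w)
	}
	inner := strings.Join(words, sep+`*`)
	// (?i) is an assumption: match regardless of case.
	return regexp.MustCompile(`(?i)(?:^|` + sep + `+)` + inner + `(?:` + sep + `+|$)`)
}

func main() {
	re := boundedPattern("FirstName LastName")
	files := []string{
		"xxx.FirstName.LastName.mp4",    // separator inside the name
		"xxx.FirstNameLastName.mp4",     // no separator inside the name
		"xxx.MyFirstNameLastName.mp4",   // name glued to a preceding word
		"xxx.FirstNameLastNameson.mp4",  // name glued to a following word
	}
	for _, f := range files {
		fmt.Println(f, re.MatchString(f))
	}
}
```

With this shape, the first two filenames match and the last two do not, so the boundary check removes the false positives without giving up the run-together form of the name.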

@WithoutPants
Collaborator Author

Changed to only log when a performer/studio/tag is added. Improved the regex to prevent false positives.

@StashAppDev StashAppDev changed the base branch from master to develop November 16, 2019 15:46
@Ch00nassid

Will this find stars and studios I might have missed? Or is this just looking for tags?

@WithoutPants
Collaborator Author

Will this find stars and studios I might have missed? Or is this just looking for tags?

It only operates on performers, studios and tags that are in the system already but not applied to scenes.

@WithoutPants
Collaborator Author

Since this is a manual action, it would be nice if it would autotag performers and scenes that already have some entries.

It already does this. To clarify the above comment, it doesn't detect if there are performers/studios/tags you haven't added to your system. I may have misunderstood @Ch00nassid's original question.

@Ch00nassid

No, you understood fine. I'm worried about my current scenes having missed a performer or studio I already have in my collection. I'd like the option to review the findings and make final approvals, but this sounds like an automated "seek and destroy" or "seek and input" ;)

@bnkai
Collaborator

bnkai commented Nov 18, 2019

From a quick test, the performer name regex seems to work best of all (at least in my DB).
Studios won't match if the filename begins with the studio, e.g. Studio.date.Performers....
Auto tagging in the tags list doesn't update the counter.
The "using regex" info message you print when iterating/autotagging performers would also be useful for tags and studios, to at least know that something is happening.

For all of them, and especially for the tags, which I could test better, you get a maximum of 25 matches.

For example, I have a specific tag in 200+ filenames and it only matches 25.
I added a new performer with over 30 scenes; only 25 matched.
Autotagging again doesn't do anything to match the remaining untagged performers, scenes, etc.

EDIT
It seems that QueryByPathRegex (in manager/task_autotag.go) gets messed up by the getPagination filter:

models/querybuilder_sql.go


func getPagination(findFilter *FindFilterType) string {
	if findFilter == nil {
		panic("nil find filter for pagination")
	}

	var page int
	if findFilter.Page == nil || *findFilter.Page < 1 {
		page = 1
	} else {
		page = *findFilter.Page
	}

	var perPage int
	if findFilter.PerPage == nil {
		perPage = 25
	} else {
		perPage = *findFilter.PerPage
	}
	if perPage > 120 {
		perPage = 120
	} else if perPage < 1 {
		perPage = 1
	}

	page = (page - 1) * perPage
	return " LIMIT " + strconv.Itoa(perPage) + " OFFSET " + strconv.Itoa(page) + " "
}

@Leopere
Collaborator

Leopere commented Nov 19, 2019

I added this as I think it's important to this kind of stuff. #221

@bnkai
Collaborator

bnkai commented Nov 26, 2019

Once more:
The regex used doesn't match studios, tags or performers if they are the first thing in the filename, e.g.

WildOnCam.19.04.12.Sydney.Cole.And.Victoria.June.Lesbian.XXX.720p.MP4-KTR.mp4

The studio WildOnCam is not autotagged.

If you create a tag called WildOnCam to try, that won't get autotagged either.

If the filename is
Sydney.Cole.And.Victoria.June.Lesbian.XXX.720p.MP4-KTR.mp4
the performer Sydney Cole won't get autotagged either, while Victoria June will be.

@WithoutPants
Collaborator Author

Issue was due to the regex pattern not considering path separators in the scene paths. Added a test for this particular scenario. Should be ready to retest @bnkai
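
A rough sketch of the failure mode, assuming the pattern is tested against the full scene path rather than the bare filename: when the name is the first thing in the filename, the character before it is a path separator, so a boundary class without `/` and `\` never matches. The helper `namePattern` and the `/videos/` directory are hypothetical, and the exact pattern in the PR may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// namePattern is a hypothetical helper. leading is the character class that
// may precede the name; words inside the name may be joined by any run of
// the usual filename separators.
func namePattern(name, leading string) *regexp.Regexp {
	words := strings.Fields(name)
	for i, w := range words {
		words[i] = regexp.QuoteMeta(w)
	}
	inner := strings.Join(words, `[. _-]*`)
	return regexp.MustCompile(`(?i)(?:^|` + leading + `+)` + inner + `(?:[. _-]+|$)`)
}

func main() {
	// Hypothetical scene path; only the filename comes from the report above.
	path := "/videos/WildOnCam.19.04.12.Sydney.Cole.And.Victoria.June.Lesbian.XXX.720p.MP4-KTR.mp4"

	withoutPathSep := `[. _-]`   // word separators only
	withPathSep := `[. _/\\-]`   // word separators plus / and \

	for _, name := range []string{"WildOnCam", "Victoria June"} {
		fmt.Println(name,
			namePattern(name, withoutPathSep).MatchString(path),
			namePattern(name, withPathSep).MatchString(path))
	}
}
```

In this sketch "Victoria June" matches either way because a dot precedes it, while "WildOnCam" only matches once the path separators are part of the leading boundary.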

@bnkai
Collaborator

bnkai commented Nov 28, 2019

I can confirm everything now seems ok

@Leopere Leopere merged commit 1704d37 into stashapp:develop Dec 1, 2019
@WithoutPants WithoutPants mentioned this pull request Feb 4, 2020
10 tasks
@WithoutPants WithoutPants deleted the auto_tag branch May 15, 2020 07:12
Labels
feature Pull requests that add a new feature investigate Investigation needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tagging based on file name
4 participants