问题描述
简介
我有3个表格:
设置:
id
名称
SetItem:
set_id
item_id
位置
TempSet
id
我有一个函数可以生成来自 Item
表的新随机组合。基本上,总是在成功生成后,我在 Set
表中创建一个新行,获取它的id并将所有项id添加到 SetItem
table。
$ b
问题
每次在生成新组合之前,我都会截断 TempSet
表中,将新项目ID填入此表中,并通过与 SetItem
表中的以前组合进行比较来检查相似性百分比。如果新的组合相似度大于或等于30%,我需要防止这种组合并重新生成新的组合。
相似性意味着 - 以前生成的组合的元素的存在。所以,这个想法是:
如果新生成的集合中有超过3个元素在以前生成的集合上重复,则阻止它并尝试生成另一个组合。
以下是生成新组合的函数:
CREATE DEFINER =`root` @ `localhost`函数`gen_uniq_perm_by_kw`(
comboSize INT(5),
尝试INT(3)
)RETURNS文本CHARSET utf8 SQL安全调用开始
iterat:
LOOP
DELETE
FROM
`TempSet`;
INSERT INTO`TempSet`(`id`)(
SELECT
`i`.`id`
FROM
`Item` as`i`
ORDER BY
RAND()
LIMIT comboSize
);
IF
(
SELECT
1
FROM
`SetItem`
GROUP BY
`set_id`
HAVING
sum(
CASE
当存在时(
SELECT
id
FROM
`TempSet`
WHERE
`id` =`item_id`
LIMIT 1
)THEN
1 ELSE 0
END
)/ count(1)* 100> = 30
LIMIT 1
)< 1 THEN
RETURN(SELECT GROUP_CONCAT(id SEPARATOR' - ')FROM`TempSet`);
END IF;
SET试试:=试试 - 1;
IF
try = 0 THEN
RETURN NULL;
END IF;
END LOOP iterat;
END
当我测试它时,即使新生成的组合元素没有不存在于任何其他以前生成的组合中,因此返回null。
我的问题是,我做错了什么?
我的问题是,我做错了什么?
您的SetItem表中没有任何数据。
编辑:你评论说这是错误的;你在SetItem中有300k行。
我得到了一个工作示例。看起来你不能像你正在做的那样使用标量子查询。我以这种方式工作:
如果存在,则删除功能gen_uniq_perm_by_kw;
DELIMITER ;;
CREATE DEFINER =`root` @`localhost` FUNCTION`gen_uniq_perm_by_kw`(comboSize INT,try INT)返回文本CHARSET utf8 SQL安全调用
BEGIN
iterat:
LOOP
从`TempSet`中删除;
INSERT INTO`TempSet`(`id`)
SELECT`i`.`id` FROM`Item` as`i` ORDER BY RAND()LIMIT comboSize;
IF EXISTS(
SELECT set_id,
SUM(CASE WHEN EXENTS(SELECT id FROM`TempSet` WHERE`id` =`item_id` LIMIT 1)THEN 1 ELSE 0 END )AS group_sum,
COUNT(*)AS group_count
FROM`SetItem`
GROUP BY`set_id`
HAVING group_sum * 10 / group_count< 3
)THEN
RETURN(SELECT GROUP_CONCAT(id SEPARATOR' - ')FROM`TempSet`);
END IF;
SET试试=试试 - 1;
IF try = 0 THEN
RETURN NULL;
END IF;
END LOOP iterat;
END
我也以更简单的方式工作,不使用SUM和extra子查询:
如果存在则删除功能gen_uniq_perm_by_kw;
DELIMITER ;;
CREATE DEFINER =`root` @`localhost` FUNCTION`gen_uniq_perm_by_kw`(comboSize INT,try INT)返回文本CHARSET utf8 SQL安全调用
BEGIN
iterat:
LOOP
从`TempSet`中删除;
INSERT INTO`TempSet`(`id`)
SELECT`i`.`id` FROM`Item` as`i` ORDER BY RAND()LIMIT comboSize;
IF EXISTS(
SELECT s.set_id,
COUNT(t.id)AS group_matches,
COUNT(*)AS group_count
FROM SetItem AS s LEFT OUTER JOIN TempSet AS t ON t.id = s.item_id
GROUP BY s.set_id
HAVING group_matches * 10 / group_count< 3
)THEN
RETURN(SELECT GROUP_CONCAT(id SEPARATOR' - ')FROM`TempSet`);
END IF;
SET试试=试试 - 1;
IF try = 0 THEN
RETURN NULL;
END IF;
END LOOP iterat;
END
Brief info
I have 3 tables:
Set:
id
name
SetItem:
set_id
item_id
position
TempSet:
id
I have a function that generates new random combinations from Item
table. Basically, always after successful generation, I create a new row in Set
table, get it's id and add all item ids into SetItem
table.
Problem
Every time before generating new combination I truncate the TempSet
table, fill new item ids into this table and check for similarity percentage by comparing with previous combinations in SetItem
table. if new combination similarity greater or equal to 30%, I need to prevent this combination and re-generate new combination.
Similarity means - existence of elements on previously generated combinations. So, the idea is:
if more than 3 element of newly generated set repeated on some previously generated set, prevent it and try to generate another combination.
Here is function that generates new combinations:
CREATE DEFINER = `root` @`localhost` FUNCTION `gen_uniq_perm_by_kw` (
comboSize INT ( 5 ),
tries INT ( 3 )
) RETURNS text CHARSET utf8 SQL SECURITY INVOKER BEGIN
iterat :
LOOP
DELETE
FROM
`TempSet`;
INSERT INTO `TempSet` ( `id` ) (
SELECT
`i`.`id`
FROM
`Item` AS `i`
ORDER BY
RAND( )
LIMIT comboSize
);
IF
(
SELECT
1
FROM
`SetItem`
GROUP BY
`set_id`
HAVING
sum(
CASE
WHEN EXISTS (
SELECT
id
FROM
`TempSet`
WHERE
`id` = `item_id`
LIMIT 1
) THEN
1 ELSE 0
END
) / count( 1 ) * 100 >= 30
LIMIT 1
) < 1 THEN
RETURN ( SELECT GROUP_CONCAT( id SEPARATOR '-' ) FROM `TempSet` );
END IF;
SET tries := tries - 1;
IF
tries = 0 THEN
RETURN NULL;
END IF;
END LOOP iterat;
END
When I test it, even when newly generated combination's elements doesn't exist in any other previously generated combination, it returns null as a result.
My question is, what am I doing wrong?
My question is, what am I doing wrong?
You don't have any data in your SetItem table.
Edit: You commented that this is wrong; you do have 300k rows in SetItem.
I got an example working. It appears that you can't use a scalar subquery like you're doing. I got it working this way:
DROP FUNCTION IF EXISTS gen_uniq_perm_by_kw;
DELIMITER ;;
CREATE DEFINER = `root` @`localhost` FUNCTION `gen_uniq_perm_by_kw` (comboSize INT, tries INT) RETURNS text CHARSET utf8 SQL SECURITY INVOKER
BEGIN
iterat :
LOOP
DELETE FROM `TempSet`;
INSERT INTO `TempSet` (`id`)
SELECT `i`.`id` FROM `Item` AS `i` ORDER BY RAND() LIMIT comboSize;
IF EXISTS(
SELECT set_id,
SUM(CASE WHEN EXISTS (SELECT id FROM `TempSet` WHERE `id` = `item_id` LIMIT 1) THEN 1 ELSE 0 END) AS group_sum,
COUNT(*) AS group_count
FROM `SetItem`
GROUP BY `set_id`
HAVING group_sum * 10 / group_count < 3
) THEN
RETURN (SELECT GROUP_CONCAT(id SEPARATOR '-') FROM `TempSet`);
END IF;
SET tries = tries - 1;
IF tries = 0 THEN
RETURN NULL;
END IF;
END LOOP iterat;
END
I also got it working in a simpler way, without using the SUM and extra subquery:
DROP FUNCTION IF EXISTS gen_uniq_perm_by_kw;
DELIMITER ;;
CREATE DEFINER = `root` @`localhost` FUNCTION `gen_uniq_perm_by_kw` (comboSize INT, tries INT) RETURNS text CHARSET utf8 SQL SECURITY INVOKER
BEGIN
iterat :
LOOP
DELETE FROM `TempSet`;
INSERT INTO `TempSet` (`id`)
SELECT `i`.`id` FROM `Item` AS `i` ORDER BY RAND() LIMIT comboSize;
IF EXISTS(
SELECT s.set_id,
COUNT(t.id) AS group_matches,
COUNT(*) AS group_count
FROM SetItem AS s LEFT OUTER JOIN TempSet AS t ON t.id = s.item_id
GROUP BY s.set_id
HAVING group_matches * 10 / group_count < 3
) THEN
RETURN (SELECT GROUP_CONCAT(id SEPARATOR '-') FROM `TempSet`);
END IF;
SET tries = tries - 1;
IF tries = 0 THEN
RETURN NULL;
END IF;
END LOOP iterat;
END
这篇关于在MySQL中插入时检查和防止类似的字符串的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!