I want to get the following result images from the input image. Each result image is surrounded by a frame with the same border width and style, but the framed rectangles differ in size. Is there a way to do this? I think the first step is to detect the area enclosed by the border, but I have no idea how. I'm trying to find a way to do it in ImageMagick.
Update 1
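This is not perfect, but it worked with OpenCV as shown below.
import cv2 as cv


def main():
    image_file = '/path/to/your/input/image.png'
    src = cv.imread(image_file, cv.IMREAD_COLOR)
    height, width, channels = src.shape
    image_size = height * width
    # cv.imread returns BGR channel order, so convert with COLOR_BGR2GRAY.
    img_gray = cv.cvtColor(src, cv.COLOR_BGR2GRAY)
    # Note: a threshold of 1000 is above the 8-bit maximum (255), so
    # THRESH_TOZERO_INV leaves every pixel unchanged here.
    retval, dst = cv.threshold(img_gray, 1000, 255, cv.THRESH_TOZERO_INV)
    dst = cv.bitwise_not(dst)
    retval, dst = cv.threshold(dst, 0, 255, cv.THRESH_BINARY | cv.THRESH_OTSU)
    # findContours returns 3 values in OpenCV 3 and 2 in OpenCV 4;
    # the last two are always (contours, hierarchy).
    contours, hierarchy = cv.findContours(
        dst, cv.RETR_TREE, cv.CHAIN_APPROX_SIMPLE)[-2:]
    xxx = 0  # index of the last accepted contour
    for i, contour in enumerate(contours):
        area = cv.contourArea(contour)
        # Skip small contours such as text and noise.
        if area < 50000:
            continue
        # Skip the contour that covers almost the whole image.
        if image_size * 0.99 < area:
            continue
        # Heuristic: skip contours whose index is within 10 of the last
        # accepted one (presumably nested duplicates of the same frame).
        if abs(i - xxx) < 10:
            continue
        xxx = i
        x, y, w, h = cv.boundingRect(contour)
        cut = src[y:y+h, x:x+w]
        detector = cv.FastFeatureDetector_create()
        detector.setNonmaxSuppression(False)
        keypoints = detector.detect(cut)
        cv.imwrite('debug_%d.png' % i, cut)


if __name__ == '__main__':
    main()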
Reference: https://angular.io/guide/providers
Update 2
fmw42's approach is great, but it is not sufficient for my requirements, as shown below (I did not mention this in the first post). Only the blue rectangle is extracted. It is possible that the background color is white.
This can be done in ImageMagick (6) using -connected-components.
Here I convert to HSV colorspace and extract the Saturation channel. White and black have no saturation, but the pink and blue do. I then threshold so that the pink and blue become white on a black background. I then use morphology erode to remove the effects of your border. Then I use connected components to fill in any holes in the white regions and then get their bounding boxes and store in an array. I then loop over each bounding box and crop the original image.
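To make the saturation argument concrete, here is a minimal sketch using only Python's standard `colorsys` module. The sample colors are hypothetical stand-ins, not values measured from the input image, and the `saturation` helper is a name introduced just for this illustration.

```python
import colorsys


def saturation(rgb):
    """Return the HSV saturation (0.0 to 1.0) of an 8-bit (R, G, B) tuple."""
    r, g, b = (c / 255.0 for c in rgb)
    _, s, _ = colorsys.rgb_to_hsv(r, g, b)
    return s


# Hypothetical sample colors, not taken from the actual input image.
samples = {
    "white": (255, 255, 255),
    "black": (0, 0, 0),
    "pink": (255, 192, 203),
    "blue": (0, 0, 255),
}

for name, rgb in samples.items():
    s = saturation(rgb)
    # Mirrors "-threshold 0": any non-zero saturation becomes foreground.
    print(f"{name:5s} saturation={s:.3f} foreground={s > 0}")
```

White and black both come out with zero saturation, so a region filled with either one ends up in the background of the thresholded mask. That is also why this approach cannot pick out a frame whose fill color is white.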
See https://imagemagick.org/script/connected-components.php
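For readers who want to see roughly what the labeling step computes, here is a rough pure-Python sketch of 4-connected labeling with bounding boxes. It is not ImageMagick's implementation: the `bounding_boxes` function, the `min_area` parameter, and the tiny mask are all made up for illustration, and small regions are simply dropped here, whereas ImageMagick's area-threshold merges them into a neighboring region.

```python
from collections import deque


def bounding_boxes(mask, min_area=1):
    """Find 4-connected foreground regions in a binary grid.

    mask is a list of rows of 0/1 values. Returns ImageMagick-style
    geometry strings "WxH+X+Y" for regions with at least min_area pixels.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill over left/right/up/down neighbours.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            area = 0
            min_x = max_x = x0
            min_y = max_y = y0
            while queue:
                x, y = queue.popleft()
                area += 1
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Simplification: regions below min_area are discarded.
            if area >= min_area:
                w = max_x - min_x + 1
                h = max_y - min_y + 1
                boxes.append(f"{w}x{h}+{min_x}+{min_y}")
    return boxes


# Tiny hypothetical mask: two rectangles plus one isolated noise pixel.
mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 1],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 1, 1, 0],
]
print(bounding_boxes(mask, min_area=4))
```

The geometry strings use the same WxH+X+Y layout that -crop accepts, so each one describes a crop rectangle for one region.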
Input:
Unix Syntax:
bboxArr=(`convert wikipedia.png \
    -colorspace HSV -channel 1 -separate +channel \
    -threshold 0 -type bilevel \
    -morphology erode square:3 \
    -define connected-components:verbose=true \
    -define connected-components:mean-color=true \
    -define connected-components:area-threshold=1000 \
    -connected-components 4 null: | grep "gray(255)" | awk '{print $2}'`)
num=${#bboxArr[*]}
for ((i=0; i<num; i++)); do
    convert wikipedia.png -crop ${bboxArr[$i]} +repage wikipedia_$i.png
done
Results:
If using ImageMagick 7, then change convert to magick.
Windows syntax requires removing the \ before ( and ), and changing each end-of-line \ to ^. grep and awk are Unix tools, so you may need to install them on Windows or find another way to filter the output.
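If grep and awk are not available, one possible substitute is a few lines of Python that apply the same filter to the verbose text. This is only a sketch: the `white_region_boxes` function is a name invented here, and the sample text is hypothetical output written in the layout that verbose=true prints, not real output from wikipedia.png.

```python
def white_region_boxes(verbose_output, color="gray(255)"):
    """Return the bounding-box field of every line that mentions color."""
    boxes = []
    for line in verbose_output.splitlines():
        fields = line.split()
        # Same filter as grep "gray(255)", same field as awk '{print $2}'.
        if color in line and len(fields) >= 2:
            boxes.append(fields[1])
    return boxes


# Hypothetical verbose output; the numbers are made up for illustration.
sample = """Objects (id: bounding-box centroid area mean-color):
  0: 560x400+0+0 279.5,199.5 150000 gray(0)
  1: 200x150+30+40 129.5,114.5 30000 gray(255)
  2: 180x120+320+220 409.5,279.5 21600 gray(255)
"""

print(white_region_boxes(sample))
```

In practice the text would come from the captured standard output of the convert (or magick) command, for example via `subprocess.run(..., capture_output=True, text=True)`, and each returned string can be passed to -crop unchanged.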
It worked perfectly, but please give me some more time. As you mentioned, I've confirmed that it does not work if the background color is white. That color is also possible in my requirements (sorry, I did not mention this in my question). I'm now trying to find a solution.
Do you need all the text paragraphs? If so, then a slightly different approach is needed. First, make all the text black on a white background. Then blur the text or use morphology open to connect the text in each paragraph. Then threshold. Then use connected components to find the bounding boxes of the text regions. Then use the bounding boxes to crop the input.
Hi. No, I don't need the text paragraphs. I need to extract output1.png and output2.png. I added the details to my question (Update 2).