Ah... how timely...
I'm in the middle of writing a VB.NET application, and the following function strips markup. (The app is for doing regular expressions a bit better than the others I've seen. So far it's NOT. I'll get it there though.)
I'm not quite done, but I would REALLY like to get a couple beta testers for it if anyone is interested. Actually, I was going to post in a few days for beta testers... It requires the .NET framework.
Hopefully this can at least point you in the right direction.
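' At the top of the file: Imports System.Text.RegularExpressions
' Strips script, style, and object blocks, then any remaining tags, from theBox.Text.
' Returns True on success, False if a replace failed.
' theBox is typed As Control here on the assumption that it is a WinForms text box.
Private Function removeMarkup(ByVal theBox As Control) As Boolean
    ' Lazy .*? stops at the first closing tag, so text between two separate
    ' blocks is kept (a greedy match would run through to the last closing tag).
    Dim strMarkUp1 As String = "<script[^>]*>.*?</script>"
    Dim strMarkUp2 As String = "<style[^>]*>.*?</style>"
    Dim strMarkUp3 As String = "<object[^>]*>.*?</object>"
    ' Any remaining tag: "<", then everything up to the next ">".
    Dim strMarkUp8 As String = "<[^>]+>"
    Try
        ' Singleline lets "." match line breaks, so a block can span several lines.
        ' Multiline only changes ^ and $, which these patterns don't use.
        Dim rmvOpts As RegexOptions = RegexOptions.Singleline Or RegexOptions.IgnoreCase
        Dim strText As String = theBox.Text
        strText = Regex.Replace(strText, strMarkUp1, " " & vbCrLf, rmvOpts)
        strText = Regex.Replace(strText, strMarkUp2, " " & vbCrLf, rmvOpts)
        strText = Regex.Replace(strText, strMarkUp3, " " & vbCrLf, rmvOpts)
        strText = Regex.Replace(strText, strMarkUp8, " ", rmvOpts)
        theBox.Text = strText
        Return True
    Catch exp As Exception
        MsgBox("We encountered an error: " & exp.Message, MsgBoxStyle.Critical, Me.Text)
        Return False
    End Try
End Function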